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Abstract 


YEOMANS,  KEVIN  DEAN.  Initialization  Issues  in  General  Differential  Algebraic 
Equation  Integrators.  (Under  the  direction  of  Dr.  Stephen  L.  Campbell.) 

The  past  several  years  have  yielded  much  research  into  the  development  of  general 
numerical  integrators  for  nonlinear  unstructured  higher  index  differential  algebraic 
equations  (DAEs)  of  the  form  F(y',y,t)  =  0  where  Fy>  is  identically  singular.  These 
methods  are  based  on  integrating  an  implicitly  defined  ODE  produced  by  forming 
the  derivative  array  equations  G(y',w,y,t )  =  0.  This  larger  nonlinear  system  may 
have  components  that  are  not  uniquely  determined.  Prior  work  has  examined  the 
theoretical  aspects  of  these  methods.  There  has  also  been  considerable  work  done  on 
the  efficient  implementation  of  these  integrators. 

This  thesis  will  examine  two  initialization  problems  that  arise  when  numerically 
solving  DAEs.  One  problem  is  the  computation  of  the  consistent  initial  conditions 
required  to  begin  the  integration  of  DAEs.  Various  line  search  strategies  will  be 
compared  and  examples  provided  showing  where  these  strategies  are  needed. 

The  second  problem  is  the  initialization  of  the  iterative  solvers  used  during  the 
integration  of  a  DAE.  Integration  of  the  implicitly  defined  ODE  depends  on  solving 
the  nonlinear  system  G  =  0  at  each  time  step.  For  fully  implicit  nonlinear  systems 
the  possible  effects  of  polynomial  prediction  on  the  nonunique  components  will  be 
examined.  A  complete  analysis  is  given  in  the  case  of  higher  index  linear  time  varying 
DAEs.  It  is  shown  that  the  standard  ODE  theory  does  not  hold  and  a  different 
prediction  strategy  must  be  used. 
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Chapter  1 


Differential  Algebraic  Equations 


Recently  much  research  has  been  done  in  developing  general  numerical  methods  for 
solving  differential  algebraic  equations  (DAEs) 

F(y',y,t)  =  0  (1.1) 

where  Fy>  is  identically  singular.  These  systems  of  differential  and  algebraic  equations 
arise  in  a  variety  of  applications  which  includes  constrained  variational  problems, 
prescribed  path  control  problems,  network  modelling  problems,  model  reduction  for 
problems  with  small  parameters,  and  discretization  of  partial  differential  equations  by 
the  method  of  lines  or  the  method  of  moving  grids.  A  detailed  survey  of  applications 
and  examples  can  be  found  in  [13,  25,  50]. 

These  general  methods  discussed  in  Section  1.2  integrate  an  implicitly  defined 
ordinary  differential  equation  (ODE)  and  have  the  advantage  that  (1.1)  does  not 
require  any  special  structure.  Additionally  these  methods  have  been  demonstrated 
to  work  on  higher  index  DAEs.  Briefly  the  index  is  a  measure  of  how  singular  a  DAE 
is  and  will  be  defined  shortly.  An  ordinary  differential  equation  (ODE)  is  index  zero. 
The  higher  the  index  the  more  complex  behavior  of  the  DAE.  Most  of  the  literature 
on  numerical  methods  for  DAEs  is  confined  to  systems  whose  index  is  three  or  less. 

These  general  approaches  [22,  24,  35]  require  the  solution  of  an  enlarged  system 
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of  nonlinear  equations.  The  iterative  solution  of  these  equations  can  have  several 
consequences.  First  is  the  need  for  sufficiently  accurate  initial  values.  This  can 
force  the  integrator  to  either  use  higher  order  methods  or  take  smaller  steps  than 
the  solution  of  the  DAE  would  appear  to  need.  In  this  thesis  we  investigate  ways 
to  globalize  the  iterative  solver.  This  has  important  consequences  for  the  consistent 
initialization  of  higher  index  DAEs.  Additionally  more  robust  step  size  strategies  are 
possible.  We  also  investigate  the  use  of  polynomial  interpolation  to  provide  more 
accurate  prediction  as  an  aid  in  solving  these  systems.  The  goal  of  these  efforts  is  to 
aid  in  the  development  of  these  more  general  DAE  solvers. 

1.1  Introductory  Survey 

We  begin  by  surveying  the  current  knowledge  about  the  theory  and  numerical  solution 
of  DAEs. 

1.1.1  Basic  Theory 

To  understand  and  illustrate  some  properties  of  DAEs,  we  begin  with  the  following 
simple  example 

zz'  +  y  =  f(t)  (1.2a) 

*  =  g(t).  (1.2b) 

Equation  (1.2)  consists  of  a  differential  equation  (1.2a)  and  an  algebraic  equation 
(1.2b),  which  can  be  thought  of  as  an  algebraic  constraint.  The  Jacobian  of  (1.2) 
with  respect  to  y',  z'  is 

0  z 
0  0 
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which  is  identically  singular  for  all  t.  By  differentiating  (1.2a) 

to  obtain  the  enlarged  nonlinear  system 

once  and  (1.2b)  twice 

zz'  +  y  =  f(t) 

(1.3a) 

ZZ"  +  {Z'Y  +  y'  =  f'(t ) 

(1.3b) 

^  =  g(t) 

(1.3c) 

II 

(1.3d) 

=  g"(t) 

(1.3e) 

we  can  solve  for  y'  and  z'  to  obtain  the  ODE 

II 

1 

1 

to 

(1.4a.) 

j  =  9'. 

(1.4b) 

The  solution  to  (1.2)  is 

1 

II 

(1.5a) 

z  —  g- 

(1.5b) 

It  is  interesting  to  note  that  if  g  is  not  continuously  differentiable,  then  the  solution 
component  y  may  in  fact  be  discontinuous.  From  this  simple  example  we  can  observe 
a  few  properties  of  DAEs: 

(1)  solutions  to  DAEs  reside  on  and  form  manifolds; 

(2)  solutions  depend  on  derivatives; 

(3)  not  all  initial  conditions  of  the  DAE  (1.2)  admit  a  smooth  solution,  those  initial 

values  which  do  are  referred  to  as  consistent  initial  conditions', 


(4)  there  are  hidden  constraints. 
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Properties  (1)  and  (2)  are  illustrated  by  the  solution  (1.5).  In  fact,  the  solution  to 
a  DAE  may  depend  on  derivatives  of  the  coefficients  of  the  state  variables.  However, 
that  is  not  illustrated  by  this  example.  It  is  well  known  that  solutions  to  DAEs 
form  submanifolds  of  the  state  space  [77].  The  theory  for  initial  value  problems 
applied  to  ODEs  says  that  any  initial  condition  for  which  the  right  hand  side  of  (1.4) 
is  Lipschitz  continuous  will  have  a  unique  solution  [61].  However,  not  every  initial 
condition  to  (1.4)  will  admit  a  solution  to  the  DAE  (1.2).  By  property  (3)  it  is  meant 
that  initial  conditions  must  satisfy  both  the  ODE  (1.4)  and  the  algebraic  constraints 
(1.5)  defined  by  the  DAE.  Since  solutions  reside  in  and  form  manifolds,  only  points 
on  these  surfaces  can  reside  on  solution  curves.  Such  points  are  called  consistent.  If 
we  multiply  (1.3d)  by  z  and  subtract  this  result  from  (1.3a)  we  get  (1.5a)  which  is  a 
constraint  along  with  (1.5b)  that  explicitly  defines  the  solution  manifold  to  the  DAE, 
illustrating  property  (4).  Later,  the  index  of  a  DAE  will  be  defined.  It  is  known  that 
higher  index  DAEs  will  always  include  hidden  constraints.  These  properties  illustrate 
some  of  the  important  differences  between  solving  ODEs  and  DAEs. 

Next,  we  define  what  we  mean  for  a  DAE  to  be  solvable.  Basically  y(t )  is  a  solution 
of  (1.1)  on  an  interval  /  if  it  is  continuously  differentiable  and  satisfies  (1.1)  for  all 
t  €  /.  The  following  definition  is  found  in  [13,  27]. 

Definition  1.1.1  Let  I  be  an  open  subinterval  of  1R.  Cl  a  connected  open  subset  of 
fft2s+1,  and  F  a  differentiable  function  from  Cl  to  Rs.  Then  the  DAE  (1.1)  is  solvable 
on  I  in  Cl  if  there  is  an  r- dimensional  family  of  solutions  <f>(t,c)  defined  on  a  connected 
open  set  I  x  Cl,  Cl  C  Itr,  such  that: 

1.  <f>(t,  c)  is  defined  on  all  of  I  for  each  c  6  Cl; 

2.  (<?/(f,c),</>(t,c),f)  G  fl  for  ( f,  c )  e  I  x  Cl; 

3.  If  ip(t)  is  any  other  solution  with  (if'{f),if{t),t)  €  Cl,  then  if(t)  =  (f>(t,c)  for 
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some  c  e  Si; 

The  graph  of  <f  as  a  function  of  (t,c)  is  an  r  +  1-dimensional  manifold. 

Intuitively  the  DAE  (1.1)  is  solvable  in  an  open  set  ft  C  R2s+1  if  the  graphs  (y',y,t) 
of  the  solutions  form  a  smooth  manifold  0  called  the  solution  manifold  and  solutions 
are  uniquely  determined  by  their  value  yo  at  any  to  such  that  (y'o,yo,to)  £  D  [27].  It 
is  possible  that  r  =  0  in  which  case  the  DAE  has  exactly  one  solution  as  in  (1.2). 

We  have  already  observed  that  the  solution  to  DAEs  depends  on  differentiation. 
The  next  key  definition  is  that  of  the  uniform  differentiation  index  [13,  26]. 

Definition  1.1.2  The  minimum  number  of  times  that  all  or  part  of  (1.1)  must  be 
differentiated  with  respect  to  t  in  order  to  determine  y'  as  a  continuous  function  of 
y,t  is  the  index,  v,  of  the  DAE  ( 1.1). 

Example  (1.2)  is  an  index  two  DAE.  Higher  index  problems  are  traditionally  more 
difficult  to  solve  numerically.  There  are  many  equivalent  definitions  for  the  index 
of  a  linear  time  invariant  DAE.  For  nonlinear  DAEs  there  are  several  definitions  of 
index  that  are  not  equivalent  [26].  The  uniform  differentiation  index  is  a  computable 
quantity  for  moderately  sized  problems  and  is  closely  related  to  establishing  solvability 
[27], 

The  following  special  structural  forms  for  DAEs  are  frequently  cited  in  the  liter¬ 
ature 

•  Fully  Implicit  DAE 

F(y',y,t)  =  0  (1-6) 

y'  = 


•  Semi-Explicit  DAE 


0  =  g(x,y,t) 


(1.7a) 

(1.7b) 
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•  Linear  Time  Invariant  DAE 

Ey'  +  Fy  =  f(t)  (1.8) 

•  Linear  Time  Varying  DAE 

E(t)y'+F(t)y=m  (1.9) 

•  Hessenberg  Index  r  DAE 

y'l  =  fi (j/i)  Vii  •  •  •  i  Vri  t)  (1.10a) 

y'2  =  /2(yi,ya,---,yr-i,i)  (l.iob) 

y'i  =  (1.10c) 

y(._i  =  fr-i{yr-2,yr-ut)  (l.lOd) 

0  =  fr(yr-i,t )  (l.lOe) 

where  6  Rn‘ ,  for  t  =  1 , . . . ,  r  and 

dfr  dfr-i  dh  dfi 

dyr- 1  %-a  5yi  dyr 

is  nonsingular. 

The  theory  for  (1.8)  is  well  established.  In  [13]  necessary  and  sufficient  conditions 
for  (1.8)  to  be  solvable  are  expressed  in  terms  of  a  matrix  pencil.  Given  matrices 
E,  F  and  A  €  C,  then  A E  +  F  is  called  a  matrix  pencil.  A E  +  F  is  said  to  be  a  regular 
pencil  if  the  determinant  is  not  identically  zero  as  a  function  of  A.  (1.8)  is  solvable 
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if  and  only  if  A E  +  F  is  a  regular  pencil  [13].  If  (1.8)  is  solvable  then  there  exists 
nonsingular  matrix  P,  Q  so  that  we  can  rewrite 

Ey'  +  Fy  =  /(<) 


as 


PEQx'  +  PFQx  =  Pf(t )  =  g{t)  (1.11) 

where 


I  0 

i 

Q 

o 

_ 1 

PEQ  = 

0  N 

,  PFQ  = 

0  I 

N  is  a  nilpotent  matrix  whose  index  is  the  same  as  the  uniform  differentiation  index 
defined  earlier.  The  decoupled  system  (1.11) 


x[  +  Cx  i  =  #i(t) 

(1.12a) 

Nx'2  +  x2  =  g2(t) 

(1.12b) 

can  then  be  easily  solved.  Equation  (1.12a)  is  an  explicit  ODE  that  has  a  unique 
solution  for  any  initial  value  of  Xi  and  differentiable  forcing  function  g\{t).  The 
unique  solution  to  (1.12b)  is 

x2  =  (ND  +  /)-'  g2(t)  =  £(-l )<&$>({) 

i=0 

where  k  is  the  index,  or  degree  of  nilpotency,  of  N,  and  D  is  the  differentiation 
operator.  Again,  we  see  that  initial  values  for  X2  are  completely  determined. 

For  (1.9),  we  have  that  if  E(t),  F(t )  are  real  analytic,  then  (1.9)  is  solvable  if  and 
only  if  we  can  rewrite  the  system  using  linear  time  varying  coordinate  changes  as 


x[  +  C(t)x  2  =  gi(t) 
N{t)x'2  +  x2  =  g?,{t). 


(1.13a) 

(1.13b) 
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The  unique  analytic  solution  to  (1.13b)  is 

*,(()  =  £  (-N(t)Df  (1.14) 

i= 0 

The  difficulty  is  computing  the  time  varying  coordinate  transformations  that  will 
transform  (1.9)  into  the  equivalent  system  (1.13).  In  [8,  21,  75],  more  results  are 
given  which  establish  the  existence  and  uniqueness  of  solutions  to  (1.9).  In  Section 

1.2  we  give  assumptions  (A1)-(A4)  that  are  computationally  verifiable  and  equivalent 
to  solvability  for  every  sufficiently  smooth  function  f(t). 

Additional  existence  and  uniqueness  results  for  solutions  to  (1.6)  and  other  struc¬ 
tural  forms  are  given  in  [72,  73,  76,  77].  Most  of  these  results  are  established  from 
a  differential-geometric  approach.  This  approach  is  based  on  the  observation  that 
DAEs  are  locally  equivalent  to  ODEs  defined  on  a  constraint  manifold. 

1.1.2  Numerical  Methods 

Most  of  the  numerical  literature  to  date  discusses  numerical  methods  for  DAEs  whose 
index  is  less  than  three  or  DAEs  with  special  structure  such  as  Hessenberg.  Some 
of  the  earliest  numerical  methods  for  DAEs  were  backward  differentiation  formulas 
(BDF)  applied  to  semi-explicit  index  one  systems  [45].  Later  BDF  methods  were 
extended  to  fully  implicit  index  one  systems 

F(y',y,t)  =  0.  (1.15) 

The  fc-step  constant  step  size  BDF  replaces  y'  by  a  polynomial  which  interpolates 
the  computed  solution  at  i, . . .  to  yield 


Equation  (1.16)  is  then  solved  by  Newton’s,  or  other  iterative  methods,  for  yn.  Theory 
for  the  convergence  of  the  fc-step  ( k  <  7)  constant  step  size  BDF  methods  has  been 
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proven  in  the  case  of  linear  constant  coefficient,  fully  implicit  index  one,  semi-explicit 
index  two  and  Hessenberg  index  two  and  three  systems  [13].  The  results  are  also 
true  for  variable  step  BDF  methods  except  in  the  Hessenberg  index  three  case.  The 
BDF  code  DASSL  [13,  71]  is  one  of  the  most  widely  recognized  DAE  production 
codes  for  solving  index  one  systems.  Several  variants  of  DASSL  have  recently  been 
developed.  These  extensions  include  DASPK  [16]  for  solving  large-scale  DAE  systems 
and  DASSLSO  [66]  for  sensitivity  analysis  of  DAE  systems.  The  linear  system  at  each 
step  of  the  integration  is  solved  by  the  use  of  a  preconditioned  GMRES  [55]  iterative 
solver  in  the  PK  variants.  Another  well  known  BDF  code  for  index  one  DAEs  is 
LSODI  [53]. 

Implicit  Runge-Kutta  (IRK)  methods  have  been  studied  extensively  in  the  case 
of  Hessenberg  index  one,  two  and  three  systems  [50] .  An  M-stage  method  applied  to 
(1.15)  is  given  by 

F  ^F/,  yn- 1  +  h  OijY j,  tn-i  +  Cihj  =0,  i  =  1, 2,  •  •  • ,  M 

M 

Dn  =  2/n— 1  T  h  ^  [  bjYj 

t=l 

where  h  =  tn  —  \ ,  Y-  are  estimates  for  y'(tn- 1  +  C{h )  and  are  called  state  derivatives, 

and  a,ij,Ci,bi  are  the  coefficients  for  the  method  [13].  Theory  for  the  fully  implicit 
index  one,  semi-explicit  index  one  and  two  cases  can  be  found  in  [13,  50].  RADAU5 
is  an  implicit  Runge-Kutta  code  that  will  solve  Hessenberg  systems  of  index  one  to 
three  [50]. 

Linear  multistep,  one-leg  and  implicit  Runge-Kutta  methods  are  extensively  ex¬ 
amined  and  analysed  in  [49,  67].  However,  it  is  assumed  the  nullspace  of  Fy>  in  (1.1) 
depends  only  on  t  or  is  constant. 

Extrapolation  methods  have  also  been  applied  to  the  solution  of  index  one  DAEs. 
LIMEX  [42]  is  one  example  of  a  code  that  has  been  developed  using  extrapolation. 


Chapter  1.  Differential  Algebraic  Equations 


10 


MEXX  [64]  is  another  extrapolation  method  based  on  half-explicit  Euler  or  half¬ 
explicit  midpoint  formulas  for  solving  the  equations  of  motion  of  constrained  multi¬ 
body  systems.  More  discussion  can  be  found  in  [13]. 

The  numerical  solution  of  algebraically  explicit  DAEs  is  examined  in  [78,  79].  It 
is  assumed  that  all  the  algebraic  constraints  or  the  algebraic  variables  are  explicitly 
defined.  Existence  proofs  for  several  types  of  DAEs  of  index  one  to  three  are  given. 

A  specialized  algorithm  for  the  numerical  solution  of  the  Euler-Lagrange  equations 
is  given  in  [74].  The  problem  is  reduced  to  a  second  order  ODE  on  the  constraint 
manifold. 

Finally,  we  mention  a  code  that  has  recently  been  released  for  solving  higher 
index  linear  time  varying  DAEs.  GELD  A  [60]  is  based  on  the  index  one  integration 
method  described  in  Section  1.2.3.  It  can  deal  with  systems  that  do  not  have  unique 
solutions  or  inconsistencies  in  the  initial  values.  A  nonlinear  version  of  this  approach 
is  currently  being  developed. 

1.1.3  Applications 

In  this  section  we  briefly  outline  some  applications  where  DAEs  occur.  In  control 
theory  applications,  it  is  frequently  the  case  that  we  have  a  differential  equation  of  the 
form  F(y',  y,  t,  u)  =  0  where  u  represents  a  set  of  controls.  The  controls  are  applied  so 
that  the  solution  satisfies  some  constraints  g(y,u)  =  0.  Such  problems  naturally  give 
rise  to  DAEs  even  if  the  differential  equation  is  explicit,  F(y',y,t,u)  =  y'  —  f(y,u). 
DAEs  whose  index  is  between  four  and  seven  frequently  arise  in  control  and  mechanics 
applications  [25]. 

Our  first  example  models  the  planar  motion  of  a  ship  loading  crane  as  shown 
in  Figure  1.1.  A  discussion  of  this  model  is  found  in  [25,  44]  and  is  a  classical 
object  of  study.  The  model  can  be  derived  by  applying  Newton’s  law  to  obtain  the 
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Figure  1.1:  Two  Dimensional  Crane. 


differential  equations  and  the  geometric  constraints  of  the  system  give  rise  to  the 
algebraic  equations.  The  crane  consists  of  a  trolley  of  mass  Mi  that  moves  along  a 
linear  horizontal  track.  A  cable  of  length  r,  and  tension  r,  connects  a  winch  on  the 
trolley  to  a  mass  M2.  The  controls  consist  of  an  external  force  Uj  which  moves  the 
trolley  and  a  torque  u2  that  is  applied  to  the  winch.  The  trolley  location  is  d  while 
the  load  location  is  ( x,z ).  The  angle  of  the  cable  with  the  vertical  is  0  and  J  is  the 
moment  of  inertia  of  the  winch.  The  load  is  required  to  follow  a  prescribed  path 

(pi(t),  p2(f)).  The  resulting  index  five  DAE  in  {ui,  u2,  x,  z,  r,  0,  x',  z',  r,  d,  r',  d'}  is 

M2£w  =  —  rsin#  (1.17a) 

M2zw  =  — t  cos  0  + mg  (1.17b) 

M\d"  =  —  C\  d1  +  u\  +  r  sin#  (1.17c) 
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Jr"  =  -C2r'  -C3u2  +  C*r  (1.17d) 

0  =  rsin0  +  d  —  x  (1.17e) 

0  =  rcosO  —  z  (1.17f) 

0  =  x-p!(t)  (l-17g) 

0  =  z  —  p2(t).  (1.17h) 


System  (1.17)  will  be  an  index  six  DAE  if  actuator  dynamics  are  included.  Note  that 
this  DAE  system  is  not  in  Hessenberg  form  since  the  variable  9  in  the  constraints  does 
not  appear  in  differentiated  form. 

The  next  example  is  a  slight  simplification  of  the  equations  for  the  prescribed  path 
control  of  a  two-link,  flexible  joint,  planar  robotic  arm  as  depicted  in  Figure  1.2. 


Figure  1.2:  Two-Link,  Flexible  Joint,  Planar  Robotic  Arm. 


x[  =  X4  (1.18a) 

x'2  =  £5  (1.18b) 

x'3  =  xe  (1.18c) 

x\  —  2c(a:3)(a:4  +  cc6)2  +  d(x3)x \  +  (2x3  -  x2)(a(x 3)  +  2b(x3))  +  a(x3)ux 
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—a(x3)u2  (1.18d) 

x'5  =  -2c(x3)(x4  +  x6)2  -  d(x3)x \  +  (2x3  -  a:2)(l  -  3a(x3)  -  2 b(x3)) 

— a(«3)ui  +  (a(a;3)  +  l)u2  (1.18e) 

x'6  =  -2c(x3)(x4  +  x6)2  -  d(x3)x \  +  (2x3  -  x2)(a(x 3)  -  %{x3))  -  2c{x3)xl 

-(x4  +  x6)2d(x3)  -  (a(x3)  +  b(x3))ui  +  ( a(x3 )  +  b(x3))u2  (1.18f) 

0  =  cosa:i  +  cos(a;i  +  x3)  —  pi(t)  (1.18g) 

0  =  sinxi  +  sin(a:i  +  x3)  —  p3{l)  (1.18h) 

where 


Pl(t)  = 

cos(e‘  —  1)  + 

cos(t  —  1) 

(1.19a) 

to 

<74* 

II 

sin(l  —  e4)  + 

sin(l  —  t ) 

(1.19b) 

a(s)  = 

2 

,  .  cos  s 

^  5  =  2  —  cos2  s 

(1.19c) 

2  —  COS2  5  ’ 

c(s)  = 

sin  s 

,  N  cos  s  sin  s 

^  5  =  2  —  cos2  s 

(1.19d) 

2  —  cos2  s  ’ 

System  (1.18,1.19)  is  also  an  index  five  DAE  in  {xi,...,x&,uuu2}  that  is  not  in 
Hessenberg  form.  Equations  (1.18a)-(1.18f)  model  the  dynamics  of  the  robot  arm 
while  equations  (1.18g)-(1.18h)  specify  the  moving  end  of  the  robot  arm  to  be  on 
the  prescribed  path  (pi(i),p2(t))  [22,  25].  The  angle  of  the  first  link  with  respect  to 
the  x  axis  is  aq,  the  angle  of  the  second  link  to  the  first  is  x3,  and  x2  is  the  rotor 
angle  with  respect  to  the  second  link  which  is  present  because  of  the  flexible  joint 
[22].  The  torques  applied  to  the  first  and  second  joints  are  the  control  variables  u4,u2 
respectively. 

In  Chapter  2,  there  are  a  few  more  examples  of  index  three  DAEs  that  arise 
in  applications  and  will  be  discussed  there.  These  are  a  chemical  reactor  problem, 
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a  trajectory  prescribed  path  control  problem  (TPPC),  and  the  torus  problem  [68]. 
However,  we  wanted  to  illustrate  here  that  models  of  unstructured  higher  index  DAEs 
do  arise  in  applications.  Methods  for  directly  solving  these  systems  is  an  area  of  active 
research. 

1.2  General  DAE  Integrators 

The  general  DAE  integrators  described  in  this  thesis  are  based  on  integrating  an  im¬ 
plicit  ODE  defined  by  the  derivative  array  equations  [24].  Suppose  the  DAE  (1.1)  is 
a  system  of  s  equations  in  the  (2s  +  l)-dimensional  variable  (t/,  y,  t).  We  assume  that 
F  is  sufficiently  differentiable  in  the  variables  (y',y,t)  so  that  all  necessary  differen¬ 
tiations  can  be  carried  out.  In  general,  the  solution  y(t)  of  (1.1)  is  known  to  depend 
on  derivatives  of  F.  If  (1.1)  is  differentiated  k  times  with  respect  to  t,  we  get 

W,V,0  =  0  (1.20a) 

Fy>(y',y,t)y"  +  Fy(y',y,t)y' +  Ft(y',y,t)  =  0  (1.20b) 

£-kr(y',y,t)  =  o.  (i.20c) 

These  s(k  +  1)  equations  are  called  the  derivative  array  equations  and  denoted  by 

G(y',w,y,t)  =  0  (1.21) 

where 

w  =  (y(2),..-,y(fc+1))  • 


We  have 


G  :  e  C  R‘(‘+2,+1  -+  R*<‘+1> 


(1.22) 
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where  it  is  assumed  that  0  is  open.  Often  in  practice  all  of  the  equations  are  not 
differentiated  v  times.  If  k  is  the  maximum  number  of  times  that  any  one  equation 
is  differentiated,  then 

G  :  0  C  R"  ->  Rm  (1.23) 

where  n  =  s(k  +  2)  +  1  and  m  <  s(k  +  1).  The  derivative  array  equations  provide 
information  about  the  solvability  of  the  DAE  (1.1),  the  index,  and  the  dimension  of 
the  solution  manifold  [27].  In  Definition  1.1.2  we  defined  the  index  u  of  the  DAE 
(1.1)  to  be  the  least  integer  k  for  which  (1.21)  uniquely  determines  y'  for  consistent 
(y,t).  If  such  a  k  exists,  then  y'  is  just  a  function  of  (y,i)  so  that 

y'  =  /(s/,0-  (L24) 

In  this  thesis  we  assume  that  the  DAE  (1.1)  is  solvable  in  a  moderate  number 
of  variables  and  that  formulas  are  explicitly  given  for  the  equations  making  up  the 

DAE. 

Before  proceeding,  we  need  one  final  definition. 

Definition  1.2.1  The  matrix  A  in  the  linear  system  Ax  =  b  is  said  to  be  1-full  with 
respect  to  x x  if  there  is  a  nonsingular  matrix  B  such  that 


The  following  proposition  is  proven  in  [68]. 
Proposition  1.2.1  The  following  are  equivalent: 

(i)  A  is  one  full  with  respect  to  X\; 

(ii)  there  is  a  nonsingular  matrix  B  such  that 

I  0 

BA  = 

0  C 
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where  I  is  an  identity  matrix  having  the  same  size  as  X\;  and 

(iii)  if 

Xi 

then  Si  =  0. 

Finally,  we  assume  throughout  this  thesis  that  an  integer  u  in  (1.20)  exists  so  that 
the  following  assumptions  hold. 

(Al)  Sufficient  smoothness  of  G. 

(A2)  Consistency  of  G  =  0  as  an  algebraic  equation. 

(A3)  J  =  [Gyt  Gw]  is  1-full  with  respect  to  y'  and  has  constant  rank  independent  of 
(y',w,y,t) 

(A4)  J/  =  [Gy>  Gw  Gy]  has  full  row  rank  independent  of  (y',w,y,t) 

Conditions  (A1)-(A4)  frequently  hold  in  practice  and  are  verifiable  using  a  combina¬ 
tion  of  symbolic  and  numeric  software  [27,  30].  Additionally,  (A1)-(A4)  are  directly 
in  terms  of  the  original  equations  and  their  derivatives  and  do  not  require  any  sort  of 
coordinate  changes.  (Al)  can  often  be  shown  to  hold  on  an  open  set  by  examining 
the  functions  which  define  the  DAE.  (A4)  can  be  verified  using  standard  numerical 
linear  algebra  routines.  If  (A4)  holds  at  a  point,  then  it  will  hold  in  a  neighborhood 
of  that  point  by  continuity  [30].  Assumption  (A2)  is  easily  checked  by  evaluating  G. 
Assumption  (A4)  is  the  most  difficult  to  verify  requiring  both  numerical  and  symbolic 
computations.  More  discussion  can  be  found  in  [30].  These  assumptions  are  almost 
equivalent  to  a  type  of  uniform  solvability  as  discussed  in  [26]. 

Currently  three  different  general  integrators  are  being  developed.  All  are  based 
on  the  derivative  array  equations  but  differ  substantially  in  their  approach. 


GA f{A) 
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1.2.1  Explicit  Integration  Method 

The  first  general  DAE  method  we  examine  is  discussed  in  [22] .  The  explicit  integration 
method  (El)  approach  consists  of  integrating  (1.24)  by  an  ODE  integration  method. 
In  order  to  do  this  one  needs  to  be  able  to  evaluate  f(yn,tn )  for  a  given  yn,t„.  This 
is  done  by  solving 

G(zn,yn,tn)  =  0  (1.25) 

for  zn  where  £  =  ( y\w ).  Currently  we  solve  G(z,y,t)  =  0  via  a  damped  Gauss- 
Newton  iteration 

z[m+l]  =  z[m]  _  ^  gt yf  t)  G(*H  y>  t)  (1.26) 

where  A^b  is  the  minimum  norm  least  squares  solution  of  Ax  =  b.  Under  assumptions 
(A1)-(A4),  it  has  been  shown  [23]  that  the  iteration  (1.26)  converges  to  a  limit  2*. 
This  limit  satisfies  the  least  squares  equation  (1.27) 

Gl(z,y,t)G(z,y,t)  =  0.  (1.27) 

Note  that  (1.27)  is  not  equivalent  to  G  =  0  but  has  additional  solutions  since  Gz  is  not 
full  row  rank.  However,  y'  is  uniquely  determined  due  to  the  1-fullness  assumption 
(A3).  Also  y'  depends  only  on  (y,t).  Thus  y'  defines  a  smooth  and  unique  completion 

y'  =  7{y,t)-  (L28) 

This  vector  field  (1.28)  will  be  called  a  least  squares  completion.  However  in  the 
remainder  of  this  thesis  we  will  just  refer  to  (1-28)  as  a  completion. 

The  next  theorem  is  proven  in  [23].  Define 

G(z,y,t)  =  G%(z,y,t)G(z,y,t). 

Theorem  1.2.1  Let 

G(z0,y0,to)  =  0,  (1.29) 

and  let  the  following  conditions  be  satisfied  on  an  open  neighborhood  of  (z0,  y0,  f0)  • 
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(i)  Gz  has  constant  rank  k; 

(ii)  Gz  is  1-full  with  respect  to  y' ;  and 

(iii)  [Gz  |  Gy]  has  full  row  rank. 

Define  p  =  rank[Gz  |  Gy]  —  k.  Then  the  following  conditions  are  satisfied: 

(а)  Gz  is  1-full  with  respect  to  y'  for  solutions  of  G  =  0; 

(б)  G  =  0  determines  locally  a  unique  f  such  that  y'  —  f(y,t )  and  {y'Q-,yofio)  lies  on 

the  graph  of  f; 

( c )  there  exists  $  :  Fts+1  — >  IR!'  such  that  solutions  of  G  =  0  satisfy  4>(y,  t)  =  0  and 

®y(yo,to)  has  rank  p;  and 

( d )  both  f  and  $  are  no  more  than  one  order  of  smoothness  less  in  ( y ,  t )  than  G  is 

in  (. z,y,t ). 

Theorem  1.2.1  states,  for  a  fixed  t,  {t/  G  Es  :  $(y,f)  =  0}  is  a  manifold  of  dimension 
s  —  p  and  $_1  ({0})  is  a  manifold  of  dimension  s  —  p  +  1,  which  is  the  solution 
manifold.  $  is  theoretically  determined  by  the  least  squares  completion  and  a  subset 
of  the  derivative  array  equations. 

Prior  research  efforts  [86]  examined  the  use  of  a  different  inverse  other  than  G\  in 
(1.26).  In  order  to  speed  up  the  method  the  reuse  of  Jacobians  was  also  investigated. 
It  is  shown  in  [38]  that  Jacobian  reuse  results  in  discontinuous  completions.  However, 
if  the  Gauss-Newton  iteration  is  solved  with  sufficient  accuracy  the  integration  can 
proceed  and  leads  to  good  numerical  approximations  to  the  solution  of  the  DAE.  A 
comparison  of  the  El  method  with  RADAU5  is  done  in  [54].  Unlike  implicit  Runge- 
Kutta  methods  there  is  no  loss  of  order  in  the  higher  index  variables  with  the  El 
method.  The  use  of  automatic  differentiation  to  form  G  and  Gz  at  each  time  step  has 
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also  been  investigated.  Substantial  savings  in  evaluation  time  and  memory  require¬ 
ments  are  reported  in  [28,  36]  for  moderately  sized  DAEs.  Automatic  differentiation 
will  likely  provide  a  means  to  solve  higher  dimensional  problems. 

One  of  the  problems  with  the  El  method  is  the  tendency  to  drift  off  the  solution 
manifold  since  constraints  are  not  all  preserved.  This  led  to  the  development  of  the 
next  general  method  for  unstructured  higher  index  DAEs. 


1.2.2  Implicit  Coordinate  Partitioning  Method 

The  implicit  coordinate  partitioning  method  (ICP)  was  developed  in  order  to  pre¬ 
serve  both  explicit  and  implicit  constraints  that  occur  in  higher  index  DAEs  [35,  68]. 
A  subset  of  the  state  variables  y  are  used  to  set  up  a  local  set  of  coordinates  for 
the  solution  manifold.  Within  this  local  set  of  coordinates,  an  explicit  integration 
approach  is  used. 

We  take  a  partition  of  the  state  variables 

V  =  (yi,ifc) 


so  that 


J  —  [Gz  Gyi\ 


has  full  row  rank.  The  variables  y2  are  used  to  parameterize  the  solution  manifold 
locally. 

Define  the  variable  z  by 

z  =  {z,yi). 


Thus  z  is  everything  but  y2  and  z  is  everything  but  y.  Then  G~  is  not  only  1-full 
with  respect  to  y'  but  it  is  also  full  row  rank  by  the  choice  of  y\.  Thus 


G~{z,  2/2,  t)G(z,  y2,  t)  =  0 
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is  equivalent  to  G(z,y2,t )  =  0.  Unlike  with  the  explicit  approach,  we  may  now  use 
any  method  of  finding  z  given  ( y2,t )  which  minimizes 

C(z)  =  ^ Cfr(z,y2,t)G(z,y2,t ). 

Again  we  use  a  damped  Gauss-Newton  iteration, 

2[m+i]  =  _  PmGl($m\y2,t)G$m],y2,t)  =  0. 

Suppose  then  that  we  have  a  point  (z0,y20,t0)  where  G(z0,y2o,t0)  =  0  and 
our  assumptions  (Al)-(A4)  hold.  Then  there  is  a  partitioning  such  that  z  = 
(y',[Z,r]],[yi,y2))  and  open  neighborhoods  N  and  N  such  that  within  these  neigh¬ 
borhoods  we  get  that  the  limit  (y[*,  y'£,  w *,  y\)  satisfies  the  fundamental  equations 


y'x  =  fi(y2,t) 

(1.30a) 

y2  =  /2(j/2,f) 

(1.30b) 

y*i  =  g{y2,t) 

(1.30c) 

where  /i,  f2 ,  h  are  defined  by  the  limit  of  the  integration  [35]. 

One  step  of  the  integration  of  the  DAE  is  as  follows.  Given  yn-\ 

—  (j/l,n— 1?  J/2,n— 1 ) 

we  apply  an  ODE  integrator  to  (1.30a)-(1.30b)  to  get  yn  =  (yi,n,  y2,n)-  A  final  function 
evaluation  gives  y'n  =  {y'lin,y2tn)  and  yl  n  =  g(y2,n,tn).  Then  the  value  for  yn  is  taken 
to  be  (yiin,y2,n)-  Thus  yn  lies  on  the  solution  manifold  and  satisfies  all  constraints. 
The  implementation  of  the  ICP  method  is  discussed  in  detail  in  [68]. 

Specific  computational  issues  such  as  Jacobian  reuse  and  partitioning  strategies 
are  further  addressed  in  [86].  The  issue  of  when  to  terminate  the  Gauss-Newton 
iteration  is  also  examined  in  [86].  If  using  a  kth  order  integrator  on  the  completion, 
then  the  iteration  should  be  terminated  at  0(hk)  or  0(hk+1)  accuracy  when  applying 
the  El  and  ICP  methods  respectively  in  order  to  obtain  kth  order  accurate  numerical 


solutions. 
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Figure  1.3  summarizes  the  steps  necessary  to  solve  a  given  DAE  by  the  El  or  ICP 
methods. 

F(y',y,t)  =  0 

G(y',  w,  y,  t)  =  0 
J(y',w,y,t) 

Consistent  Initial  Conditions 
Given  t0,  find  y'0,w0,yo 
Verify  (A1)-(A4) 


Figure  1.3:  Summary  of  General  DAE  Integrators 
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1.2.3  Index  One  Integration 

The  final  method  that  is  related  to  the  El  and  ICP  methods  is  being  developed  by 
Kunkel  and  Mehrmann  [58,  60].  However,  it  is  a  computationally  different  approach. 
To  understand  their  approach,  suppose  that  we  have  a  DAE  F(y',  y,t)  =  0  and  the 
system  of  equations 

g(y,t)  =  0  (1-31) 

describes  the  constraint  manifold  of  F(y',y,t )  =  0  and  gy  has  full  row  rank.  Then 
locally  there  is  a  partition  of  y  =  (yi,J/2)  and  a  subset 

F(y',y,t)  =  0  (1.32) 

of  the  equations  F(y\  y,  t)  =  0  such  that  gV2  and  Fyi  are  nonsingular.  Then  the  DAE 
composed  of  (1.31)  and  (1.32)  is  index  one  and  has  the  same  solutions  as  F(y',y,t)  = 
0. 

The  method  proceeds  as  follows.  Let  Dh  be  the  fc-step  BDF  operator  and  suppose 
t  is  the  current  time.  A  matrix  Z  is  computed  from  the  Jacobian  of  G.  Then  the 
solution  y  of 

ZF(Dhy,y,t  +  h)  =  0  (1.33a) 

G(z,y,t  +  h)  =  0  (1.33b) 

is  taken  as  the  estimate  for  y(t  +  h ).  Note  that  the  equations  (1.33b)  implicitly 
determine  a  relationship  (1.31).  Z  is  such  that 

ZF{y',y,t )  =  0  (1.34a) 

g(y,t)  =  0  (1.34b) 

is  index  one.  Here  (1.34a)  is  the  same  as  (1.32).  Therefore  solving  (1.33)  is  equivalent 
to  solving  (1.34)  by  the  BDF  method.  This  method  does  not  require  constant  rank 
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of  the  Jacobian  of  G  in  a  neighborhood  of  a  solution  to  the  DAE  but  only  on  a 
submanifold  [58]. 

1.2.4  Constrained  Least  Squares 

There  is  another  numerical  approach  based  on  the  derivative  array.  However,  it  is 
not  a  general  method  since  it  requires  some  numerical  structure  to  the  DAE. 

The  Constrained  Least  Squares  (CLS)  method  by  Barrlund  [5,  6]  applies  BDF 
to  the  derivatives  appearing  in  the  derivative  array  equations  and  minimizes  this 
discretization  subject  to  a  subset  of  the  derivative  array  equations.  At  each  step 
of  the  integration  the  DAE  and  some  of  its  derivatives  are  used  as  constraints  to  a 
least  squares  problem  that  corresponds  to  a  multistep,  multiderivative  formula  for 
the  solution.  It  is  required  that  the  user  be  able  to  identity  all  necessary  constraints. 
An  analysis  of  the  stability  properties  of  the  CLS  method  for  linear  DAEs  is  given  in 

[7]- 


1.3  Outline  of  Thesis 

The  purpose  of  this  thesis  to  examine  two  initialization  problems  that  arise  when  nu¬ 
merically  solving  DAEs  by  general  methods.  Resolution  of  these  problems  is  needed 
before  production  quality  codes  can  be  developed.  The  first  problem  is  the  computa¬ 
tion  of  the  consistent  initial  conditions  required  to  begin  the  integration  of  DAEs.  In 
Chapter  2,  various  line  search  strategies  are  examined  and  compared  for  solving  the 
nonlinear  system  G(y',w,y,t)  =  0.  A  good  initialization  strategy  might  also  prove 
beneficial  in  our  integrators  if  step  size  is  being  limited  by  the  iteration  needing  good 
starting  values.  This  could  permit  us  to  take  larger  time  steps  or  to  use  lower  order 
integrators  on  higher  index  DAEs.  Results  of  this  effort  have  been  published  in  [31]. 
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The  second  problem  that  is  examined  is  the  initialization  of  the  iterative  solvers 
used  during  the  integration  of  a  DAE.  In  Chapter  3,  we  discuss  how  linear  multistep 
methods  that  are  used  in  the  integration  of  the  completion  are  implemented  to  provide 
initial  iterates  used  in  the  Gauss-Newton  iteration. 

We  examine  the  case  where  previous  converged  values  of  z  are  used  to  fit  a  polyno¬ 
mial  that  is  used  for  prediction  in  the  Gauss-Newton  solve.  It  is  shown  in  the  linear 
time  varying  case  that  the  z  components  are  actually  approximating  the  solution  to 
another  DAE  which  we  label  the  auxiliary  DAE  or  ADAE.  The  results  of  our  analysis 
indicate  that  constant  and  linear  prediction  appear  feasible  to  implement. 

Chapter  4  summarizes  the  main  conclusions  of  this  research  effort. 

Chapter  5  discusses  areas  of  future  investigation. 

1.4  Contributions  of  Thesis 

The  research  in  this  thesis  has  appeared  in  the  following  publications. 

•  S.  L.  CAMPBELL,  C.  T.  KELLEY  AND  K.  D.  YEOMANS,  Consistent 
Initial  Conditions  for  Unstructured  Higher  Index  DAEs:  A  Computational 
Study,  in  Proc.  Computational  Engineering  in  Systems  Applications,  Lille, 
France,  1996,  pp.  416-421. 

•  S.  L.  CAMPBELL,  R.  HOLLENBECK,  K.  YEOMANS  AND  Y.  ZHONG, 
Mixed  Symbolic-Numerical  Computations  with  General  DAEs  I:  System 
Properties.  Preprint,  1997. 

•  S.  L.  CAMPBELL  AND  K.  D.  YEOMANS,  Behavior  of  the  Nonunique 
Terms  in  General  DAE  Integrators.  To  appear  in  Applied  Numerical 
Mathematics,  1997. 


Chapter  2 


Consistent  Initialization  of  General  DAE 
Integrators 


Obtaining  a  consistent  set  of  initial  conditions  is  perhaps  the  most  difficult  part  in 
determining  a  numerical  solution  of  a  DAE  by  the  general  DAE  methods  described 
in  Chapter  1.  Initialization  is  also  important  in  dealing  with  discontinuities  of  the 
solution  that  frequently  occur  in  applications  [65].  There  has  been  work  on  the 
initialization  of  special  classes  of  DAEs  [17,  20,  57,  63,  69].  However  in  this  chapter 
we  examine  the  problem  of  computing  consistent  initial  conditions  for  the  derivative 
array  based  general  integrators  (El  and  ICP  methods). 

In  initialization  we  will  not  have  good  predictors  for  the  starting  values  particularly 
for  the  higher  derivative  variables  w.  Therefore  we  will  need  to  consider  more  global 
iterative  schemes.  The  globalization  of  iterative  schemes  has  been  extensively  studied 
and  several  approaches  considered  [39,  55].  In  this  chapter  we  examine  various  line 
search  algorithms  that  are  applied  in  the  context  of  our  general  DAE  integrators. 
Unlike  Chapter  3  which  includes  theoretical  results,  this  chapter  consists  primarily  of 
computational  studies. 
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2.1  Algorithmic  Issues 

The  initialization  of  higher  index  DAEs  has  several  features  that  need  to  be  kept  in 
mind  in  the  developing  of  numerical  methods. 

One  option  of  many  ODE  and  DAE  integrators  is  for  the  user  to  specify  an  absolute 
error  tolerance  that  the  solution  is  to  meet.  In  order  to  meet  this  error  tolerance  it 
is  essential  that  the  initial  conditions  be  found  to  a  given  tolerance.  This  means  that 
the  stopping  criteria  must  both  guarantee  that  we  are  near  a  suitable  initial  condition 
and  that  the  approximation  is  within  a  prescribed  error  tolerance.  In  particular,  our 
stopping  criteria  must  insure  both  small  residual  and  small  steps. 

With  initialization  we  expect  to  have  very  poor  initial  estimates  of  many  of  the 
initial  values.  Thus  we  need  a  robust  global  algorithm.  We  assume  we  are  dealing 
with  equations  which  are  known  explicitly.  By  using  automatic  differentiation  codes 
if  necessary  we  can  assume  that  exact  Jacobians,  up  to  round  off  error,  are  available 
at  substantially  less  computational  cost  than  that  of  one  matrix  factorization  [29,  36]. 

It  is  fairly  rare  in  our  experience  to  just  want  an  initial  condition.  Usually  there  is 
a  subset  of  the  variables  which  are  known,  or  for  which  we  have  good  estimates.  The 
number  of  these  variables  could  be  the  same,  less  than,  or  greater  than  the  actual 
degrees  of  freedom  in  the  solution.  If  these  quantities  are  allowed  to  vary  during 
the  iteration,  then  the  final  initial  condition  many  have  greatly  changed  in  these 
components.  Thus  we  want  to  allow  for  the  user  to  specify  a  subset  of  the  variables 
which  are  considered  known.  We  would  like  to  get  a  single  method  that  works  for 
any  reasonable  specification  by  the  user. 

With  any  DAE  integrator  one  usually  needs  not  only  initial  values  of  y  but  also 
y' .  However  with  either  the  El  approach  or  the  ICP  approach  we  have  to  initialize 
the  entire  w  vector  for  the  nonlinear  equation  solver.  For  a  general  unstructured 
problem,  parts  of  the  w  vector  will  be  arbitrary.  We  shall  consider  three  scenarios. 
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The  assumptions  given  on  the  Jacobians  are  based  on  the  theoretical  results  used  in 
developing  the  general  integrators  [27].  Let  z  =  (y',  w,  y ),  to  simplify  our  notation  we 
let  u  denote  whichever  variables  in  z  are  allowed  to  vary  and  we  write  the  derivative 
array  as  G(u).  The  limit  of  the  iteration  will  be  denoted  by  u*  and  J(u )  =  Gu(u). 

Scenario  1  (SI):  We  merely  seek  a  solution  of  the  derivative  array  equations.  No 
components  of  z  are  assumed  known.  The  Jacobian  is  full  row  rank  in  a  neighborhood 
of  u*. 

Scenario  2  (S2):  A  subset  z2  of  z  is  specified.  We  assume  that  the  dim(z2)  is 
less  than  the  dimension  of  the  solution  manifold  of  the  DAE  and  that  the  z2  variables 
are  a  subset  of  local  coordinates  for  the  solutions  of  G(u)  —  0.  The  Jacobian  is  full 
row  rank  in  a  neighborhood  of  u*. 

Scenario  3  (S3):  A  subset  z 2  of  2  is  specified.  We  assume  that  the  dim(^2) 
is  greater  than  the  dimension  of  the  solution  manifold  of  the  DAE.  The  Jacobian  is 
assumed  to  have  constant  rank  in  a  neighborhood  of  the  limit  point  it*.  However, 
the  Jacobian  is  neither  full  row  nor  full  column  rank.  This  scenario  is  also  important 
during  the  integration  of  a  DAE  by  the  El  approach.  We  expect  the  nonlinear  residual 
||Cr(tt*)||  will  be  small  but  possibly  nonzero. 

There  is  a  hybrid  scenario  where  first  a  portion  z\  of  the  variables  are  kept  constant 
and  the  remaining  variables  determined  by  an  iteration.  Then  all  of  the  variables  are 
allowed  to  vary.  This  could  be  modified  in  the  final  stage  so  that  some  of  the  z\ 
components  remain  fixed.  The  idea  is  to  first  reduce  the  error  in  the  less  well  known 
variables  so  that  their  “error”  is  the  same  as  the  error  of  the  better  known  variables. 
Prior  experience  has  suggested  that  this  is  sometimes  advantageous  [33] . 

We  assume  that  the  Jacobian  has  constant  rank  in  a  neighborhood  of  u*.  There 
is  no  reason  to  assume  that  J  has  constant  rank  during  the  iteration.  The  typical 
situation  that  we  consider  is  that  the  Jacobian  is  constant  rank  throughout  its  domain 
except  on  certain  lower  dimensional  manifolds.  During  the  iteration  we  may  pass  near, 
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or  land  on,  these  manifolds. 


2.2  Line  Search  Algorithms 

There  are  three  things  to  be  decided  for  an  iteration;  what  direction  to  move,  A un, 
how  far  to  move  in  that  direction,  and  how  to  terminate  the  iteration.  Our  iterations 
will  take  the  form 

1^71+ 1  —  un  T  pn  A Un  (2.1) 

where  0  <  pn  <  1.  We  will  terminate  the  iteration  if  Au  and  the  residual  ||G(u)||  are 
less  than  tolerances  Ex  and  Er  respectively.  Ex  is  to  insure  sufficient  accuracy  in 
the  solution.  Er  is  to  prevent  prematurely  stopping  the  iteration. 

Gauss— Newton:  The  plain  Gauss-Newton  iteration  uses  Aun  =  —  J\un)G{un) 
and  pn  —  1: 

^n+i  =  un  T^(un)G(un).  (2.2) 

The  theory  for  Gauss-Newton  (like  ordinary  Newton’s  method)  requires  the  starting 
value  to  be  near  the  solution  [9,  10,  52].  The  plain  Gauss-Newton  has  performed 
reasonably  well  in  our  prior  experiments  with  integrators.  However,  as  to  be  expected, 
it  performed  very  poorly  during  our  initialization  tests  with  even  moderately  poor 
initial  guesses. 

There  are  a  variety  of  (minimization)  algorithms  and  some  are  being  examined 
for  use  with  DAEs.  For  example,  sequential  linear  programming  methods  (SLP)  are 
currently  being  examined  for  chemical  engineering  problems  [48].  However,  there  are 
several  reasons  for  wanting  to  utilize  a  method  based  on  a  variant  of  the  Gauss- 
Newton  method.  One  of  the  most  important  for  us  is  that  such  a  method  is  based 
on  information  of  the  type  that  we  are  using  in  the  DAE  integrator.  Also,  in  the 
numerical  integrator  it  would  be  advantageous  to  have  a  more  robust  iteration.  A 
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good  initialization  strategy  might  also  prove  beneficial  in  our  integrators  if  step  size 
is  being  limited  by  the  iteration  needing  good  starting  values.  This  could  permit  us 
to  take  larger  time  steps  or  to  use  lower  order  integrators  on  higher  index  DAEs. 

Damped  Gauss— Newton:  There  are  many  ways  to  pick  pn  in  the  damped 
Gauss-Newton : 

un^-i  =  un  pnJ^  (un,s)G(un'l .  (2.3) 

We  want  the  method  to  revert  to  plain  Gauss-Newton  near  u*.  We  choose  pn  so  that 
the  value  of  a  scalar  test  function  Tn(u)  decreases  on  the  nth  iteration.  There  are  a 
variety  of  line  search  methods.  One  is  to  initially  take  pn  —  1  and  then  halve  pn  until 

Tn(un+i)  <  (1  -  apn)  Tn(un )  (2.4) 

for  a  fixed  a  =  10~4  [39,  55,  56]. 

The  norm  squared  of  the  residual  is  one  natural  choice  of  Tn 

Ifw  =  =  l|G(u)||2.  (Tl)  (2.5) 

There  are  several  reasons  to  expect  that  a  different  test  function  might  be  better. 
First,  because  of  the  absolute  error  tolerances  on  u*  we  do  not  want  to  terminate  the 
iteration  just  on  small  residuals  G(un).  Also,  we  expect  the  derivative  array  equations 
will  sometimes  have  nontrivial  condition  numbers  and  there  may  be  ill  conditioning 
encountered  during  the  iteration.  Condition  numbers  of  104  or  higher  can  be  routinely 
expected.  The  limit  of  the  Gauss-Newton  iteration  satisfies  J^(u*)G(u*)  =  0.  For  ill 
conditioned  problems  the  test  function 

rIV)  =  ||J'(«n)G(«)||2.  (T2)  (2.6) 

has  been  suggested  [40,  41].  The  gradient  of  Tj^\u)  is 


VTW(tt)  =  2GT{u)(j\un))Tj\un)J(u). 


(2.7) 
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At  the  current  point  un  we  have  that  VTT^(un)  is  the  Gauss-Newton  direction. 
Thus  there  is  always  a  p  <  1  which  will  cause  to  decrease. 

Truncated  Gauss-Newton:  If  J  =  UEVT  is  the  singular  value  decomposition 
(SVD)  of  J,  then  define  Sj  to  be  £  with  all  singular  values  below  6  set  to  zero.  Define 
J$  =  UEsVT  and  j\  =  ( Js)l.  This  leads  to  the  iteration 

^n+l  =  Un  (^n ) G(^Un ) .  (2.8) 

In  all  our  calculations  involving  singular  values  the  small  singular  values  are  set  to 
zero  so  we  are  always  using  a  Js  instead  of  J.  However,  our  value  of  5  is  small  so  that 
only  singular  values  which  are  theoretically  zero  but  numerically  nonzero  are  ignored. 
One  could  also  use  larger  values  of  S  as  a  way  to  try  and  counteract  ill  conditioning 
of  J. 

Steepest  Descent  Residual  Minimization:  One  could  try  to  minimize  the 
residual  ||Gr(u)||2  in  the  direction  of  its  gradient: 

=  Un  pn.J  (^n)G(Uji).  (2.9) 

Levenberg— Marquardt:  During  the  iteration  one  can  encounter  or  pass  close 
to  singularities  in  the  rank  of  J.  A  classical  way  to  handle  singularities  during  an 
iteration  is  the  Levenberg-Marquardt  iteration  [39,  51] 

Un+l  =Un-  Pn(JT(un)J(un )  +  £ /)_1  JT (un)G(un)  (2.10) 

with  e  -4  0  as  un  — >  u*.  If  J{u*)  has  full  row  rank,  then  this  method  acts  like 
Gauss-Newton  near  u*.  Away  from  u*  it  acts  likes  j\un)G(un)  in  the  direction  of 
large  singular  values  of  J  and  like  \JT  in  the  direction  of  small  singular  values  of  J. 
Because  we  have  the  full  row  rank  assumption  in  scenarios  SI  and  S2,  the  variant 


Un- (-1  —  Un  pnJ  (Un)(J(Un)J  {Un)  T  ®/)  G('Un) 


(2.11) 
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is  more  appropriate  for  our  intended  application.  In  the  direction  of  large  singular 
values  of  J,  (2.10)  acts  like  (2.9). 

Proposition  2.2.1  Given  the  preceding  definitions  and  assumptions: 

(a)  Jl(un)G(un)  ^  0  if  and  only  if  JT (un)G(un)  0. 

(b)  jj(un)G(un)  0  implies  that  J\un)G(un)  0. 

(c)  If  any  of  —J^G,  —JT(JJT  +  eI)~lG  and  —JTG  are  nonzero  at  a  point  un,  then 

they  all  are  descent  directions  for  (Tl)  and  (T2), 

(d)  If  —JIG  is  nonzero  at  a  point  un,  it  is  a  descent  direction  for  (Tl)  and  (T2). 

Proof:  Part  (a)  follows  from  the  fact  that  Af(J\un))  =  Af(JT(un))  where  A f  repre¬ 
sents  the  null  space  of  a  matrix. 

Part  (b)  is  obvious  since  if  G(un )  ^  Af(J$(un)),  then  it  is  immediate  that  G(un )  0 
Af( J*(un))  since  Af(jl(un))  C  Af(J]{un)). 

To  demonstrate  (c),  we  need  to  show  that  the  inner  product  of  the  gradient  of 
either  (Tl)  or  (T2)  with  any  of  the  given  steps  is  negative.  We  note  that 


VTTW(„„) 

=  2  JT(un)G(un) 

(2.12) 

Vt7?K) 

=  2  JT{un)(j\un))T  j\un)G{un). 

(2.13) 

In  order  to  simplify  the  notation,  it  will  be  understood  that  the  function  G  and  its 
Jacobian  J  are  evaluated  at  the  point  un. 

Assume  that  —  J^G  is  nonzero,  then  using  properties  of  Jt  [32]  we  have 

(VT|1])  (~J'g)  =  -2 GTJJ]G 

=  -2(7T(JJtJ)JtG 

=  -2GT(JJt)T(JJt)G 

=  -2||JJtG||2  <  0. 
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Also  we  have 

(VT^)  =  —2 GTJfTJ1JJfG  =  -2 GFJ^JG  =  -2||  Jf(?||2  <  0. 

Therefore  —  J^G  is  a  descent  direction  using  either  (Tl)  or  (T2). 

Assume  that  —JT(JJT  +  eI)~lG  is  nonzero,  then  for  (Tl) 

(VTW)  +  eI)~lG )  =  -2GtrJJT(JJT  +  eI)~lG 

=  -2GTUtUTGT 


where  S  is  a  diagonal  matrix  whose  elements  are  given  by 

^ 2 

—r1 —  if  (Ti  7^  0,  and  0  otherwise, 
erf -he 

and  (Ti  are  the  singular  values  of  J  computed  from  its  SVD,  J  =  UEVT .  Therefore 

(VTi1])  (- JT(JJT  +  eI)~lG)  =  -2\\UEUtG\\2  <  0 

where  UtjUT  is  the  positive  semi-definite  square  root  of  JJT(JJT  +  el)'1  [62]. 

For  the  test  function  (T2)  we  have 

(VTj2l)  (— JT(JJT  +  eI)~lG)  =  —2 GfrjiTJ'JJT(JJT  -heiy'G 

=  -2(f  JJ\  JJT  +  eI)~xG 

=  —2GtUY,UtG 

=  -2||C/EC/tG||2  <  0 


where  S  is  a  diagonal  matrix  with  diagonal  entries 


if  (Ti  7^  0,  and  0  otherwise, 


and  UEiUT  is  the  positive  semi-definite  square  root  of  J J\J JT  +  el)  Therefore 
—JT(JJT  +  e/)_1(7  is  a  descent  direction  using  either  (Tl)  or  (T2). 
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If  —JTG  /  Owe  have 

(VT]1])  ( -JtG )  =  —2GtJJtG  =  — 2||  JtG\\2  <  0 

and 

(VTi2])  (■ -JtG )  =  -2 GtJ'tJUJtG 

=  -2(?J'T(JU)tJtG 

=  -2  GT(JJt)T(JJt)TG' 

=  —2  Gt{JJ])tJJ]G 
=  -2||JJ+G||2  <  0. 

Again  we  conclude  that  —JTG  is  a  descent  direction  using  either  (Tl)  or  (T2). 

For  part  (d)  assume  —  j\G  ^  0,  then 

(VTW)(-4<j)  =  -2  (fjllG 

=  -2GtJ(JU4)G 

=  -2  Gt{JJI)tJJ\G 

=  —2\\JJgG\\2  <  0. 

A  similar  computation  as  in  part  (c)  for  the  LM  step  —JT(JJT  +  £/)-1Cr  holds  for 
the  test  function  Tj®.  D 

2.3  Numerical  Examples 


In  the  examples  that  follow,  the  Jacobians  were  computed  analytically  in  MAPLE. 
The  Moore-Penrose  inverse  was  computed  by  an  SVD  of  J.  Additionally  we  set 
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a  =  10~4  and  6  =  10-8  unless  noted  otherwise.  8  is  designed  to  ignore  numerically 
zero  singular  values  rather  than  to  regularize  an  iteration  with  small,  but  nonzero, 
singular  values.  Also,  Ex  —  Er  =  10~10.  The  tolerances  were  set  so  that  we  could 
examine  the  long  run  behavior  of  the  iteration.  For  initialization  one  might  well  want 
different  Ex  and  Er  values. 

We  first  started  our  initialization  tests  by  perturbing  a  known  solution  u(to).  We 
generated  several  random  directions  w  scaled  in  a  similar  manner  to  u(to)  by 

(w)i  =  (-1  )Pi  ri  (u(t0))i 

where  p,-  =  1,2  and  .5  <  r,  <  1.  We  then  took  starting  values  at  different  distances 
in  those  directions.  The  initial  point  was  u0  =  u(t0 )  +  lO7-1^  7  =  0, 1, ...  ,4.  Our 
intention  is  to  examine  the  behavior  of  the  iteration  and  the  effect  of  increasingly  poor 
initial  guesses.  We  experimented  with  several  variants  of  the  Levenberg-Marquardt 
method.  Setting  e  =  0  when  the  residual  ||  G^H  became  sufficiently  small  appeared 
to  be  the  best  strategy.  In  what  follows  we  made  the  simple  choice  of  e  =  10-4 
if  ||(7||  >  10-4,  otherwise  e  =  0.  We  also  considered  the  test  function  Tn(u )  = 
||  JT(u)(J(u)JT(u)  +  el)~l  JT (u)G(u) ||  as  well  as  (T2).  The  test  functions  performed 
the  same. 

We  did  a  large  number  of  experimental  runs  with  Scenario  1  (SI).  Plain  Gauss- 
Newton  performed  very  poorly  when  we  had  poor  initial  guesses.  It  is  clearly  not 
practical  for  initialization.  Not  truncating  small  singular  values  when  computing  J^G 
led  to  convergence  problems.  On  the  other  hand  an  aggressive  truncation  strategy 
designed  to  perform  regularization  near  singularities  was  also  less  reliable.  A  small 
but  numerically  nonzero  value  worked  best. 

We  found  that  implementing  Levenberg-Marquardt  by  forming  ( J  JT+eI),  solving 
(, JJT  +  el)y  =  G,  and  then  letting  A u  =  -JTy  frequently  converged  much  more 
slowly  (or  not  at  all),  then  computing  an  SVD  of  J  —  UEVT  and  setting  Au  = 
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— VET (Y2EiT -\-eI)~lUT G .  One  explanation  is  that  there  is  too  much  loss  of  accuracy  in 
the  “normal  equations”  version  of  Levenberg-Marquardt  for  the  problems  considered 
here. 

When  the  iterations  converged,  the  residual  usually  met  its  stopping  criteria  a  few 
iterations  before  Au  did.  On  the  other  hand,  with  particularly  poor  initial  guesses, 
there  were  examples  where  Au  met  the  required  stopping  criteria,  but  we  did  not 
have  small  residuals.  A  combined  stopping  criteria  appears  to  be  required. 

In  the  remaining  discussion  we  will  adopt  the  following  notation  when  discussing 
various  methods  and  damping  strategies: 

PGN  -  Plain  Gauss-Newton 

DGNR  -  Damped  Gauss-Newton,  test  function  (Tl) 

DGNB  -  Damped  Gauss-Newton,  test  function  (T2) 

PLM  -  Plain  Levenberg-Marquardt 

DLMR  -  Damped  Levenberg-Marquardt,  test  function  (Tl) 

DLMB  -  Damped  Levenberg-Marquardt,  test  function  (T2). 

GN  is  any  of  the  tested  Gauss-Newton  methods,  and  LM  is  any  of  the  Levenberg- 
Marquardt  methods. 

2.3.1  Chemical  Reactor  Problem 

The  first  example  is  taken  from  [33]  and  is  an  index  three  DAE  in  the  four  variables 
C ,  R,  T  and  Tc 


C’  +  C  +  R  =  4  +  t  +  t3 

(2.14a) 

T'  +  2T  +  R  +  Tc  =  l  +  e~* 

(2.14b) 

T-1  +  \n(R/C)  =  0 

(2.14c) 
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C  =  cosh(f  —  1).  (2.14d) 

System  (2.14)  models  the  situation  where  C  is  a  specified  product  concentration  and 
we  want  to  determine  the  temperature  Tc  (open  loop  control)  that  will  produce  this 
C.  Each  of  the  equations  in  (2.14)  is  differentiated  three  times  so  that  G  consists 
of  16  equations.  Many  of  the  equations  are  linear  in  the  unknown  variables.  The 
solution  to  (2.14)  is  given  by 


c  = 

cosh(t  —  1) 

(2.15a) 

R  = 

4  +  t  +  t3-C-C' 

(2.15b) 

T  = 

-1/ln  (R/C) 

(2.15c) 

Tc  = 

l  +  e*-*)  -V-2T-R. 

(2.15d) 

MAPLE  was  used  to  differentiate  the  solution  and  evaluate  all  the  equations  at  time 
t  =  0  to  obtain  u(0).  For  7  —  0,1,2  the  GN  methods  converged  typically  in  3, 
4  and  5  iterations.  The  LM  methods  also  converged  but  in  2  to  3  times  as  many 
iterations.  However  for  7  =  3,  PGN  usually  failed.  DGNRand  DGNB  converged  more 
often.  There  did  not  appear  to  be  any  difference  with  the  choice  of  test  function  in 
determining  the  damping  parameter.  The  same  results  occurred  for  the  LM  methods. 
However  the  damped  LM  methods  required  substantially  more  iterations  when  far 
from  a  solution.  Figure  2.1  is  a  plot  of  the  iteration  histories  for  the  damped  GN  and 
LM  methods  for  an  initial  condition  where  7  =  3.  The  undamped  GN  methods  did 
not  converge  in  this  test. 
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Figure  2.1:  Comparison  of  Damping  Strategies  for  Chemical  Problem 

Table  2.1  shows  the  results  of  a  test  which  converged  after  86  iterations  for  DLMB. 
The  GN  methods  did  not  converge.  We  observe  that  the  converged  values  of  some 
of  the  higher  order  derivatives  are  not  close  to  the  initial  values  which  is  expected 
since  these  components  are  not  uniquely  determined.  The  first  column  is  the  initial 
iterate  u0  given  when  7  =  3.  The  second  column  is  the  limit  of  the  iteration  u*.  The 
third  column  is  the  unperturbed  solution  it(0).  The  fourth  column  is  the  absolute 
error  componentwise  between  u*  and  u(0).  The  nonlinear  residual  was  ||(7(«*)|  = 
.151518D  -  14. 

2.3.2  Shuttle  Trajectory  Problem 

The  next  example  is  the  shuttle  trajectory  (TPPC)  problem  described  in  [13].  This 
is  also  an  index  three  DAE  in  seven  variables.  Additionally,  it  is  a  Hessenberg  sys¬ 
tem  which  allows  us  to  reduce  the  number  of  differentiations  to  some  of  the  equa¬ 
tions.  The  equations  are  highly  nonlinear.  We  exploit  the  Hessenberg  structure  and 
differentiate  equations  (2.16a)-(2.16d)  twice,  equations  (2.16e)-(2.16f)  once  and  the 
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Uo 

u* 

«(0) 

|  Error 

C' 

-.211161D+01 

-.117520D+01 

-.117520D+01 

.444089D-15 

R! 

.952947D+00 

.632121D+00 

.632121D+00 

.111022D-15 

V 

.458495D+00 

.127679D+01 

.127679D+01 

.444089D-15 

T'c 

-.118081D+00 

-.642026D+00 

-.642026D+00 

.352052D-12 

~CW 

.307968D+01 

.154308D+01 

.154308D+01 

.222045D-15 

RW 

-.734643D+00 

-.367879D+00 

-.367879D+00 

.555112D-16 

Ji(2) 

-.614925D+01 

-.354368D+01 

-.354368D+01 

.133227D-14 

rp{2) 

1C 

-.136720D+02 

.458258D+01 

-.696186D+01 

.115444D+02 

-.300881D+00 

-.117520D+01 

-.117520D+01 

.444089D-15 

r?> 

.123989D+01 

-.250947D+02 

.563212D+01 

.307268D+02 

T(  3) 

.290901D+01 

.387265D+01 

.154171D+02 

.115444D+02 

rp(3) 

lC 

.374820D+01 

.147837D+02 

.520078D+02 

.372241D+02 

~c^ 

.436052D+00 

.322699D+02 

.154308D+01 

.307268D+02 

RW 

-.567714D+00 

-.567714D+00 

-.367879D+00 

.199834D+00 

r(  4) 

-.946989D+01 

.156564D+01 

-.894741D+02 

.910398D+02 

7^(4) 

1c 

-.873182D+03 

-.873182D+03 

-.466743D+03 

.406439D+03 

c 

.263811D+01 

.154308D+01 

.154308D+01 

.222045D-15 

R 

.192360D+00 

.363212D+01 

.363212D+01 

.444089D-15 

T 

-.188381D+01 

-.116818D+01 

-.116818D+01 

.222045D-15 

Tc 

-.947791D+00 

-.572563D+00 

-.572563D+00 

.111022D-15 

Table  2.1:  Results  of  Initialization  of  Chemical  Problem  using  DLMB  Method 


constraint  (2.16g)  three  times.  The  initial  conditions  are  taken  from  [12,  34]  where 
t0  =  332.867734542.  G  has  20  equations  in  35  unknowns.  It  is  known  that  there  are 
two  values  of  the  control  variable  (bank  angle  /3 )  that  are  close  to  each  other.  Only 
(3  >  0  is  physically  correct. 


H'  = 

Vr  sin  (7) 

(2.16a) 

r  = 

VR  cos(7)  sin(A) 
r  cos(A) 

(2.16b) 

A'  = 

—  cos  (7)  cos  (A) 
r 

(2.16c) 
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Vr  = 


m 


7  = 


D2Er  cos(A)(sin(A)  cos(A)  cos(7)  —  cos(A)  sin(7)) 
L  cos(/3)  cos  (7)  (V& 


mV r 


+ 


Vr 


—  —  g  +  2 £Ie  cos(A)  sin(A) 


^ Er  ^  ^  (sin(A)  cos  (A)  cos(7)  +  cos(A)  sin(  7)) 


A'  = 


Vr 

Lsm(/3)  Vr  ,  \  ,  a\.  n\ 

— — — H - cos (7)  sin(A)  tan(A) 

mvR  cos(7)  r 

—2De  (cos(A)  cos(A)  tan(7)  —  sin(A)) 
flEr  cos(A)  sin(A)  sin(A) 


Vr  cos  (7) 


along  with  a  path  constraint 


£  -  [Co  +  <MVr  -  Vo)  +  C72(Vfi  -  Vo)2  +  (73(Vr  -  Vo)3 


=  0. 


Some  of  the  variables  are  given  in  terms  of  the  state  variables  [33]  as: 


p(H)  =  .002378  exp(— tf/23800) 
CL{a)  =  .84  —  .48(38  —  aCrd)/26 
D  =  .5  pCDSV% 


CD(a)  =  .78  -  .58(38  -  aCrd)/26 
r  =  H  +  ae 


g  =  p/r2 

L  =  .5  pCLSVl 


(2.16d) 


(2.16e) 


(2.16f) 


(2.16g) 


The  following  parameters  and  constants  are  also  given: 

p  =  0.1407653916  x  1017  ft3/s2 
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EIr 

— 

.72921159  x  1(T4  rads/sec 

Co 

— 

D(to)/m  =  3.974960446019 

Ch 

-0.01448947694635 

g2 

— 

-0.2156171551995  x  10-4 

g3 

— 

-0.1089609507291  x  10"7 

de 

~ 

20902900ft 

m 

— 

5964.4965  slugs 

s 

= 

2690  ft2 

Crd 

= 

360/2tt 

OL 

— 

0 

O 

The  shuttle  problem  is  not  well  conditioned  with  singular  values  ranging  from 
approximately  10~4  to  104.  LM  often  did  not  converge  even  for  7  =  0  or  1.  When 
convergence  occurred,  it  usually  required  150-300  iterations.  On  the  other  hand  GN 
took  5-7  iterations  for  7  =  0  and  15-35  iterations  for  7  =  1.  At  7  =  2,  PGN  usually 
failed  while  DGN*  worked  some  times.  However,  the  limit  u*  frequently  had  the 
component  f3  <  0. 

To  illustrate  S2  we  fixed  (3,  (3 '  and  Vr  in  this  example.  However,  the  Jacobian  now 
has  fewer  columns.  While  the  Jacobian  may  still  be  full  row  rank  it  may  be  less  well 
conditioned.  An  examination  of  the  singular  values  of  J  show  that  this  in  fact  occurs 
during  the  iteration  with  singular  values  ranging  from  10-9  to  105.  Table  2.2  shows 
the  result  of  a  test  where  the  DGNB  worked  requiring  27  iterations.  Note  that  the 
components  (3 ,  (3'  and  Vr  did  not  move  from  their  initial  values.  For  this  example  and 
initial  iterate,  the  DGNR  did  not  converge  and  flagged  that  the  Jacobian  experienced 
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a  rank  drop.  The  LM  methods  all  stagnated  at  a  local  minimum.  Again,  the  DGNB 
method  as  discussed  in  [40]  appeared  the  most  attractive  in  this  scenario. 


Uo 

u * 

u(t0) 

|  Error 

H' 

-.292419D+03 

-.318278D+03 

-.318295D+03 

.169781D-01 

c 

.129469D-02 

.121420D-02 

.120518D-02 

.902030D-05 

X' 

.496299D-03 

-.517715D-03 

.525288D-03 

.104300D-02 

Vk 

-.337139D+01 

-.358773D+01 

-.358792D+01 

.191383D-03 

i 

.945038D-04 

.101053D-03 

.101068D-03 

.151924D-07 

A' 

.908818D-03 

.845165D-03 

.833641D-03 

.115243D-04 

P 

-.881555D-02 

-.881555D-02 

-.881555D-02 

.000000D+00 

H 

.243808D+06 

.264039D+06 

.264039D+06 

.349246D-09 

t 

.287535D+01 

.287535D+01 

.310177D+01 

.226412D+00 

A 

.517299D+00 

-.370668D+01 

.559232D+00 

.426591D+01 

VR 

.243171D+05 

.243171D+05 

.243171D+05 

.OOOOOOD+OO 

7 

-.120586D-01 

-.130890D-01 

-.130897D-01 

.698258D-06 

A 

.119394D+01 

-.203833D+01 

.109586D+01 

.313419D+01 

0 

.717332D+00 

.717332D+00 

.717332D+00 

.OOOOOOD+OO 

Table  2.2:  Results  of  Initialization  of  Shuttle  Problem  using  DGNB  Method 


2.3.3  Torus  Problem 

The  following  is  an  index  three  DAE  from  [35]  in  the  seven  state  variables 
{Xi,X2,X3,Ui,U2,U3,A} 


x[  =  Ui 

(2.17a) 

X2  =  U2 

(2.17b) 

x'z  =  u3 

(2.17c) 

u\  =  u3 cos(f)  —  x3sm(t)  —  u2  +  2x\  [^1 

-  r(x\  +  x\)  1/2]  A 

(2.17d) 

u'2  =  u3  sin(t)  +  x3  cos(f)  +  u\  +  2x2  [l 

-  r(x \  +  ®2)_1/2]  ^ 

(2.17e) 
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«3  =  -X3  +  2a;3A 

(2.17f) 

0  =  x\  +  x\  +  x\  -  2 r(x\  +  x\f/2  +  r2  -  p2. 

(2-17g) 

The  solution  is  given  by 

X\  =  [p  cos(27T  —  t)  +  r]  cos  t 

(2.18a) 

x2  =  [p  cos (27 r  —  t)  +  r]  sin  t 

(2.18b) 

X3  =  psin(27r  —  t) 

(2.18c) 

which  lies  on  a  torus.  The  solution  manifold  is  four  dimensional.  In  the  tests  we  set 
p  =  5,  r  =  10.  All  equations  are  differentiated  three  times  so  that  G  has  28  equations. 
Additionally  a  nonlinear  transformation  is  applied  to  the  problem  so  that  it  is  fully 
implicit  [35,  68].  Initial  conditions  were  taken  from  [68]. 

For  this  example,  all  the  methods  were  essentially  the  same  for  7  =  0, 1,2  taking 
4,  4-5,  and  6-9  iterations  respectively.  PGN  failed  at  7  =  3,4.  DGNR  usually  failed 
at  7  =  4  while  DGNB  converged  more  frequently  for  this  value  of  7.  The  damped 
LM  methods  often  converged  when  all  the  GN  methods  failed.  Figure  2.2  displays 
the  results  for  convergence  of  the  LM  methods  when  7  =  4.  However  none  of  the 
GN  methods  converged.  The  conditioning  of  the  problem  was  extremely  poor  as  the 
singular  values  ranged  from  10-5  to  109  during  the  iteration. 


2.4  Continuous  Gauss— Newton  Analogues 

In  has  been  noted  [11,  43,  70]  that  solving  the  square  nonlinear  system  G(u)  =  0  by 
Newton’s  method  is  equivalent  to  integrating 

v!  =  —J~l{u)G{u) 


u(t0)  =  u0 


(2.19) 

(2.20) 
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Figure  2.2:  Iteration  History  of  LM  Methods  for  Torus  Problem 

with  Euler’s  method  and  a  suitable  choice  of  stepsize.  Tanabe  [84]  proved  that  in  the 
underdetermined  case,  trajectories  to  the  differential  equation 

u'  =  -j\u)G{u)  (2.21) 

u(t0)  =  «0,  (2-22) 

where  J(uo)  is  full  row  rank,  will  either  approach  a  stationary  point,  diverge,  or 
stagnate  at  a  point  where  a  drop  in  the  rank  of  J  occurs.  A  stationary  point  u*  will 
be  a  solution  to  G(u)  =  0  that  we  seek.  In  [85],  Tanabe  extended  his  analysis  to  a 
continuous  analogue  of  the  Levenberg— Marquardt  method 

u'  =  -JT(u)(j{u)JT(u)  +  el)G(u).  (2.23) 

It  is  noted  that  the  Levenberg-Marquardt  method  (2.23)  seems  to  have  a  better 
chance  to  converge  to  a  solution  u*  because  it  is  able  to  resolve  rank  drops  that  might 


occur. 
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We  can  also  pose  the  problem  in  terms  of  unconstrained  optimization  by  minimiz¬ 
ing  the  function 

f(u)  =  ^Gt(u)G(u).  (2.24) 

Convergence  results  for  ODE  methods  applied  to  (2.24)  can  be  found  in  [1,  2,  3,  80, 
81,  82].  Additionally  some  numerical  tests  comparing  ODE  methods  with  sequential 
quadratic  programming  algorithms  (SQP)  is  done  in  [14,  15]. 

We  have  not  implemented  integration  methods  to  control  the  damping  parame¬ 
ter.  However,  results  cited  in  these  articles  indicate  that  further  study  of  continuous 
methods  for  initialization  of  our  general  DAE  integrators  might  have  merit. 

We  will  use  this  relationship  between  the  Gauss-Newton  and  continuous  Gauss- 
Newton  methods  in  Section  3.4  of  Chapter  3. 

2.5  Conclusions 

Initialization  by  solving  the  derivative  array  equations  appears  to  be  a  practical  ap¬ 
proach  for  moderately  sized  problems.  As  expected,  some  damping  strategy  is  needed. 
The  damped  Gauss-Newton  appears  most  attractive.  However  the  fixing  of  certain 
known  values,  while  advantageous,  can  have  the  affect  of  increasing  the  condition 
number  of  the  Jacobians  and  increasing  the  likelihood  of  passing  near  singularities. 
The  Levenberg-Marquardt  appears  to  handle  near  singularities  better  but  requires 
considerably  more  iterations.  The  stopping  criteria  must  enforce  both  small  steps 
and  small  residuals. 

The  issue  of  scaling  of  the  Jacobians  has  not  been  examined  and  remains  a  topic  of 
further  research.  Considerable  more  testing  is  needed  with  more  examples,  especially 
larger  dimensional  ones,  and  at  additional  u*  values. 


Chapter  3 


Prediction  Strategies  in  General  DAE 
Integrators 


The  general  DAE  integrators  (El,  ICP  and  101  methods)  require  the  solution  of  the 
nonlinear  system  of  equations 

G{y',w,y,t)  =  0  (3.1) 

at  each  time  step.  For  fully  implicit  problems  this  enlarged  nonlinear  system  will 
have  components  that  are  not  uniquely  determined.  This  poses  questions  about  the 
effects  of  predictors  and  a  possible  instability  in  the  growth  of  these  terms  during  a 
numerical  integration.  In  this  chapter  it  is  shown  that  the  nonunique  components 
are  actually  the  numerical  solution  of  an  auxiliary  DAE  which  depends  not  only  on 
the  original  DAE  but  also  the  predictor  being  used  in  the  Gauss-Newton  iteration. 
The  behavior  of  these  nonunique  components  could  interfere  with  the  convergence 
of  the  nonlinear  equation  solver.  This  is  radically  different  from  the  case  for  ODEs 
and  DAEs  with  special  structure  where  the  predictor  used  for  iterative  solvers  does 
not  directly  affect  the  limit  of  the  iteration.  Finally,  the  “true  solution”  that  is  being 
sought  at  each  time  step  is  a  vector  of  Taylor  coefficients,  except  that  only  some  of 
them  are  found  correctly.  In  [23,  35,  36,  37]  in  order  to  guarantee  at  least  a  first 
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order  prediction  to  start  the  Newton  iteration  it  was  necessary  to  assume  that  the 
integrator  being  used  had  order  higher  than  the  index  of  the  DAE.  For  index  one 
or  two  systems  this  is  not  a  difficulty.  However,  this  requirement  becomes  a  severe 
restriction  when  considering  DAEs  whose  index  is  greater  than  three.  The  results 
on  predictors  will  be  applied  to  the  question  of  developing  low  order  integrators  for 
higher  index  DAEs. 

We  begin  this  chapter  by  reviewing  linear  multistep  methods.  For  the  El  and  ICP 
methods,  linear  multistep  methods  have  been  applied  to  the  ODE  which  is  implicitly 
defined  by  the  derivative  array  equations.  The  justification  for  these  methods  is 
based  on  their  capability  of  achieving  reasonable  accuracy  using  very  few  function 
evaluations  per  forward  step.  The  function  evaluations  are  obtained  by  solving  the 
derivative  array  equations. 

Next  we  illustrate  how  an  Adams-Bashforth-Moulton  (ABM)  method  (predictor- 
corrector  pair)  is  implemented  as  a  one  step  method  via  the  Nordsieck  transforma¬ 
tion.  This  implementation  is  used  to  provide  an  approximation  to  the  higher  order 
derivatives  at  each  step  of  the  integration.  ABM  methods  are  used  in  the  current 
versions  of  the  El  and  ICP  methods  under  development. 

We  briefly  review  polynomial  interpolation  and  its  role  in  providing  predictors  for 
the  nonunique  components  in  the  system  G  —  0.  The  linear  time  varying  case  is 
analyzed  in  detail  with  examples  given  to  illustrate  the  results  derived.  We  briefly 
discuss  the  nonlinear  case  and  the  reuse  of  Jacobians.  Finally  we  both  establish  a  basis 
for  the  design  of  low  order  integrators  for  high  index  DAEs  and  develop  guidelines 
for  the  use  of  predictors  in  integrating  general  high  index  DAEs. 
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3.1  Linear  Multistep  Methods 

We  begin  our  studies  with  a  fc-step  method  applied  to  the  scalar  ODE 

y'  =  f(y,t).  (3.2) 

We  assume  that  the  step  size  h  is  fixed.  Let  yn_;  and  hy denote  the  numerical 
approximations  for  y(tn_;)  and  hy'(tn_i)  respectively,  for  i  =  0, . . . ,  k  —  1.  Arrange 
these  values  in  the  vector 

rp 

Yn  =  [yn  Vn-1  •  •  •  Vn-k+1  hy'n  hy'n_x  •  •  •  hy'n_k+-^  .  (3.3) 

Multistep  methods  find  a  numerical  approximation  for  Yn+i  from  Yn.  The  components 
of  Yn  are  typically  generated  by  a  Runge-Kutta  method  in  order  to  begin  the  iteration. 
For  notational  convenience,  we  assume  that  the  index  of  a  vector  or  a  matrix  starts 
from  0.  Define  the  matrix  B  by 


where  the  omitted  entries  are  zeros.  Let 
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Vn,(  0) 
Vn-1 

^/h— fc-(- 1 

hy'n,(  0) 

%n-l 

K>-fc+ 


so  that  the  parameters  are  constants  such  that 

k 

2/n,(0)  ^  Vn—i  "h  h 

4=1 

(3.7) 

is  an  approximation  to  y(tn ),  and 

k 

Vn,( 0)  =  y™-i  +  h  h  y'n-i) 

i-1 

(3.8) 

is  an  approximation  to  hy'(tn).  The  equation  (3.7)  represents  an 
method  that  will  be  referred  to  as  the  predictor  (P). 

Let  (Yn)i  denote  the  ith  component  of  Yn.  Define 

explicit  multistep 

G(Yn^0),tn)  =  —  (Yn^o))k  +  hf((Yn^0))o,tn) 

(3.9) 

=  ~ ^yn,(0)  "b  ^/(j/n,(0))  tn). 

(3.10) 

Recall  that  y'n  denotes  the  numerical  approximation  to  y'(tn).  We  see  immediately 
that  (3.9)  will  be  identically  zero  if  the  computed  value  of  y  and  /  at  the  current 
mesh  point  satisfies  the  differential  equation  (3.2).  The  computation  of  f(yn,tn )  is 
the  evaluation  (E)  step. 
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Subsequent  iterations  are  given  by 

K,(m+1)  =  Yn,(m)  +  C  G(Yn,{m) ,  tn)  (3.11) 

for  m  =  0, 1, . . where 

Ynt(m)  —  2/n— 1  '  '  '  Dn— fc+1  ^Vn— 1  '  '  fc+l]  '  (3.12) 

and  c  is  a  2/c-dimensional  vector  defined  as 


c  =  0  •  •  •  0  10  •  •  •  0]' 


(3.13) 


and  1  is  the  kth  component  of  c.  Equation  (3.11)  will  be  referred  to  as  the  functional 
iteration  and  will  involve  an  evaluation  (E)  of  /.  This  iteration  may  be  repeated  for 
a  fixed  number  of  steps  or  until  \G\  <<  1.  It  will  be  helpful  to  note  that  after  the 
first  iteration,  we  have 


^2/n,( i)  —  ^2/n,(o)  A  (  4-  h/(yni(o),  tn)) 


(3.14) 


and 


k  k 

Vn,{  1)  =  XX"*  -  >30  7 i)Vn-i  +  h  -  Po  Wn-i 

8=1  8=1 

T  hfto  f{yn, (0) )  fn)- 
After  m  iterations,  we  obtain 

hy'n,(m)  =  hy'n^m_X}  + (—hy'n^m_X}  + hf(yn^m-i),tn)) 
=  hf(i)n^m— i),  fn) 


(3.15) 


(3.16) 


and 


k  k 

J/n,(m)  =  —  0Q  7 i)Vn-i  +  h  J~]{0i  ~  00  ^)Vn-i 

i= 1  s=l 


“t"  h0O f  {Un, (771-1)1  ^n)* 


(3.17) 
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If  h  is  chosen  sufficiently  small  so  that  /dy)\  <  1,  then  (3.17)  converges. 

Also  if  yn,(m)  -»  y»,  then  (3.16)  implies  that  y'n  {m)  -»•  y'n  where  yn,y'n  satisfy 

y'n  =  f(yn,tn).  (3.18) 


Thus  we  have 


k  k 

yn  =  Yjfxlyn-i+hY;  Yn-i+  hPo /(y»»,*n)  (3-19) 

«=1  i=l 

where  a*  =  a;  —  /3q  7,  and  (3*  =  fli  —  (3%  5i  for  i  =  l,...,fc.  Equation  (3.19)  rep¬ 
resents  an  implicit  multistep  method  that  will  be  referred  to  as  the  corrector  (C). 
The  implementations  that  have  been  incorporated  into  the  general  DAE  integrators 
being  developed  at  North  Carolina  State  University  use  a  kth  order  Adams-Bashforth 
predictor  and  a  kth  order  Adams-Moulton  corrector  which  will  be  referred  to  as  an 
ABM  kth  order  pair.  In  practice  only  a  fixed  number  of  functional  iterations  are 
implemented  and  are  summarized  by  the  notation  P(EC)M E.  The  final  function 
evaluation  is 


Yn,(M+i)  =  Ynm  +  ekG(Ynm,tn) 
where  ek  is  a  2fc-dimensional  vector  defined  as 

ek  =  [0  0  •  •  •  0  10  •  •  •  0]T 


(3.20) 
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and  1  is  the  kth  component.  If 


Yn,(M)  ~ 


then 


Yn,(M+l)  — 


and 


Vn,{M) 

Vn-\ 

Vn-k+1 

hy'n,(M) 

hy'n- 1 

hy'n-k+ 1  _ 

Vn,(M) 
Vn- 1 

Vn— fc+l 

hy'n,(M+l) 
hy'n- 1 

hy'n-k+1 


hy'nt(M+ 1)  -  hy'n,(M)  +  (~^y'n,(M)  + hf(yn^M),tn)) 

=  hf{yn,(M)i  In)- 


(3.21) 


(3.22) 


(3.23) 

(3.24) 


The  first  component  yn,(M)  is  unchanged.  Linear  multistep  methods  are  frequently 
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derived  from  Taylor  expansions  or  interpolation  formulas.  The  elements  of  B  and  c 
for  various  ABM  methods  can  be  found  in  [61]. 


3.1.1  Nordsieck  Implementation 

The  idea  behind  the  Nordsieck  implementation  is  to  be  able  to  convert  a  linear  multi- 
step  step  into  a  one-step  method.  This  transformation  makes  it  easier  to  change  the 
step  size.  An  example  of  the  Nordsieck  vector  for  a  kth  order  ABM  pair  is  given  by 

Un 

hy'n 


—v" 

2! 


&V*> 

k\  Un 


(3.25) 


The  kth  order  ABM  method  upon  which  our  implementation  is  based  can  be  formu¬ 
lated  using  the  vector 

r 

Vn 
hy'n 

hy'n„i  (3-26) 


Yn  = 


hy'n-k+l  J 

which  is  a  special  case  of  (3.3).  It  turns  out  that  (3.25)  and  (3.26)  are  related  by  a 
nonsingular  linear  transformation 

Zn  =  QYn  (3.27) 

where  Q  is  chosen  so  that 


Z(tn)  =  QY(tn)  +  0(hk+l). 


(3.28) 
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Q  turns  out  to  be  independent  of  h  [61].  Z(tn )  and  Y(tn)  represent  the  exact  values 
of  Zn,Yn  respectively.  Using  the  formula 

hy'(tn  -  jh)  =  U-Jt1  i  r^)  +  0(hk+l)  (3.29) 

we  get  the  expression 


1 

1 

1 

-1-2 

(— l)2 ' 3  • 

1 

-2-2 

(— 2)2  •  3  • 

•  (-2)fc-J  •  k 

1 

-k  ■  2 

HO2- 3  •• 

••  (-H1  •  * 

(3.30) 


where  the  omitted  entries  are  zeros.  Using  the  transformation  (3.27)  we  have 

Zn,(0)  -  Q  ynt(0) 

=  QBYn-i 

=  QBQ  1  Zn—\  ? 

and 


^n,(m+ 1)  =  Q  ^71,(771+1) 

—  QYn^ rn)  "b  ?  tn) 

:=  “b 

—  ^n,(m)  “b  d(?(Q  ^n,(ra)  7  ^n) 

where  d  =  Qc.  The  final  function  evaluation  is  given  by 

Zn,(M+l)  =  Q  ^n,(M+ 1) 
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=  Qyn,(M)  +  Q^\G(Yn^M),tn) 

—  QYnj(M)  dr  QeiG(Q  1Zn}(M),tn) 

=  Zn,(M)  +  elG{Q  1  Zn,(M),tn). 

The  matrix  QBQ~X  turns  out  to  be  the  Pascal  triangle  matrix  given  by 


QBQ-1  = 


The  implementation  of  these  methods  for  a  system  of  ODEs  will  have 


(3.31) 


^n,(0)  —  ( QBQ  ®  I)Zn- 1 


as  the  prediction  step, 

1)  ^n,(m+ 1)  ®  /)C?((5  ^n,(m)  ?  ^n) 

for  the  correction  steps,  and 


Zn,(M+l)  —  ^n,(M)  T  (®1  ®  Zn^M)iIn) 


as  the  final  function  evaluation.  /  is  an  identity  matrix  whose  dimension  is  the  same 
as  y,  and  <8>  denotes  the  Kronecker  product.  If  A  €  lRmXi  and  B  £  Rnxfc,  then  the 
(right)  Kronecker  product  [62]  A  (g>  B  is  defined  to  be  the  partitioned  matrix 


A®  B  = 


ctuB  (I12B  •  •  •  o, \iB 
CI21B  CI22B  •  •  •  a,2iB 


e  R 


mnxlk 


(3.32) 


*  *  *  ttmlB 
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3.1.2  Prediction  using  Nordsieck  Vector 

The  goal  of  the  general  DAE  methods  is  to  integrate  the  implicitly  defined  ODE  given 
by  the  derivative  array  equations 

G{y',w,y,t)  =  0 

where  w  =  [i /", . . . ,  y^u+1^]  and  v  is  the  index  of  the  DAE.  For  convenience,  this  section 
will  assume  that  we  are  using  the  El  method  for  obtaining  a  solution  to  the  original 
DAE  F(y',y,t )  =  0.  Let  £  =  [y',w\  so  that  we  are  considering  the  nonlinear  system 

G{z,y,t)  =  0  (3.33) 


and  J  =  Gz. 

Integrating  the  implicitly  defined  ODE 


y'  =  f{y,t) 


(3.34) 


requires  that  we  be  able  to  solve  for  y^+1  given  yn+i  i  Ai+i  •  This  is  done  by  solving 
(3.33)  via  a  damped  Gauss-Newton  algorithm 

4”t"  =  4+’i  -  (3-35) 


as  described  in  [22,  23]. 

Prior  work  [68,  86]  assumed  the  the  order  of  the  integrator  k  was  such  that  k  > 
v  +  1.  By  using  the  Nordsieck  transformation  described  in  the  previous  section  to 
integrate  the  completion,  we  have  after  each  step  the  vector 


Zn  — 


Un 

hy'n 


(3.36) 
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where 


— 


and 


bn  — 


2!  yn 


„(«/+!) 


^+2  „(H-2) 
(H-2)!  yn 


^Ly(fc) 

fc! 


(3.37) 


(3.38) 


Define 


m 


21  / 
b?  1 


31  / 

fc3  i 


fi±Dl  r 

fcH-i  y 


(3.39) 


We  can  obtain  the  approximations  of  the  first  u  +  1  derivatives  of  the  solution  by 
premultiplying  (3.36)  by  a  matrix  as  shown  in  (3.40) 


Vn 

hn 

= 

Wn 

I  0 


0  0 
0 

0  0  T(h)  0 


0  1/  0 


Vn 

hy'n 

dn 

~ 

bn 

(3,40) 


The  initial  iterate  in  (3.35)  is  then  given  by 


[0] 

1 

y'n 

4+1  = 

— 

™n+l 

- , 

£ 

e-h 

(3.41) 


The  benefits  of  this  approach  are  that  we  have  at  least  0(h)  approximations  to 
the  “true”  derivatives  of  the  solution  prior  to  beginning  the  Gauss-Newton  solve.  It 
is  hoped  that  this  will  allow  the  numerical  solution  to  remain  within  the  region  of 
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attraction  of  the  solution  manifold  to  the  DAE.  However  for  high  index  problems 
( v  >  3),  this  approach  will  require  using  high  order  ABM  methods.  It  is  well  known 
that  the  stability  regions  for  ABM  methods  shrink  as  the  order  increases  [61].  The  use 
of  high  order  methods  for  high  index  problems  may  force  the  integrator  to  take  small 
steps  which  may  not  be  appropriate  for  the  problem.  We  would  like  the  capability  of 
using  low  order  integrators  on  higher  index  problems  when  it  appears  feasible. 


3.2  Polynomial  Interpolation 


One  traditional  way  to  improve  the  iterative  solvers  is  through  the  use  of  better 
predictors.  We  begin  this  section  by  examining  the  use  of  polynomial  interpolation  in 
initializing  the  iteration  (3.35).  Given  values  zn_;, . . . ,  zn  at  times  . . . ,in,  define 

(t  tn-j ) 


Li,k{t)  ~  U 


J=o  (tn—k  ^n-j) 


(3.42) 


then  the  Lagrange  form  of  the  interpolating  polynomial  is  given  by 

Pi(t)  =  Li,k(t)zn~k •  (3.43) 

k= o 

The  starting  value  of  our  iteration  at  time  tn+ 1  is  given  by  p(tn+i).  For  the  cases 
i  =  0,1,2  we  have  from  (3.43) 

Po(t)  =  Zn 

Pi(t)  =  L\fi{t)zn  +  L\}i(t)zn-\ 

{t  —  tn~i)  (t  —  tn) 

:zn  +  77 - 7~\zn-l 


(tn-tn-l)  (tn-l-tn) 

P2{t )  =  hj2fl(t)zn  +  Z/2,l(t)z„_i  +  L2,2{t)zn-2 

(t  tn-\)(t  tn- 2)  ~  (  ( t  tn)(t  tn—2) 

;zn  1  ~  ;  \7T  I  \Zn—l 


(j'n  ^n—l)(^n  2)  (^n— 1  1  2) 

. _ (t  ~  tn){t  ~  tn-l)  ^ 

{tn- 2  ~  tn)(tn- 2  -  tn-l)  "~2* 
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For  a  fixed  step  size  h ,  these  equations  reduce  to 

Zn+ 1  =  Po(^n+l)  =  Zn 

Zn+1  ~  Pi  {tn+\ )  =  Zn— 1 

Zn+1  =  Piifn+l^  =  3zn  3zn—\  T  Zn— 2- 

If  z{t)  is  sufficiently  differentiable,  then  it  is  easy  to  show  that  for  small  enough  h 
that  \\z(tn+i)  —  pi(tn+ 1)||  =  0(hq)  where  q  is  the  minimum  of  i  +  1  and  the  error  in 
the  already  computed  values  of  zm. 

Before  proceeding  further,  we  will  define  a  couple  of  operators  that  will  prove 
useful.  Let  {( zn,tn )  |  zn  G  n  =  0, 1,2, . . .  and  tn  G  R}  be  a  set  of  equally  spaced 
data  points  in  Rs+1  ( tn  =  t0  +  nh).  Then  the  forward  shift  operator  E  is  defined  by 

Ezn  =  zn^. i,  E  zn  —  E(Ezn)  zn.f-2)  etc. 

We  similarly  define  E  for  negative  exponents,  e.g.  E~zzn  =  zn- 3.  The  backward 
difference  operator  V  is  defined  by  V  =  1  —  E~l.  This  leads  to 

V  zn  =  zn  zn~  1 

V  Zn  =  V (zn  Zn—  1)  =  Zn  ‘lzn—\  “I-  Zn— 2. 

or,  in  general, 

V‘z„  =  (1  -E~')kzn 

=  £(-if  (?)  E-% 

i= 0 

=  D-l)‘  (?)  *»-<•  (3-44) 

t= 0 

If  z(t )  G  Ck+1,  we  have  the  following  result  from  [61] 


V‘z(i„)  =  hkz^(tn)  +  0(hM). 


(3.45) 
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Let  pm(t)  be  the  predictive  polynomial  of  degree  m.  We  will  also  assume  that  the 
step  size  h  is  fixed  so  that  the  predictor  is  given  by 

m 

?»(*»+.)  =  £(-!)'  (”S)  *.-<■ 

i=0 

Lemma  3.2.1  Assume  z  is  sufficiently  smooth.  Then  (^n+i  —  Pm(tn+ 1))  /hm+1  will 

,  (m+1) 

converge  to  z^+1  \ 

Proof:  This  result  follows  almost  immediately  from  (3.44)  and  (3.45). 

m 

Zn+ 1  -  Pm{tn+ 1)  =  *n+l  “  (™+l)  Zn-i 

i= 0 

=  EM'fflw 

i=0 

=  Vm+1£n+i 

=  hm+1zl™ll)  +  0(hm+2). 


□ 


3.3  Analysis  of  Linear  Time  Varying  Case 

Given  the  linear  time  varying  (LTV)  DAE 

E{t)y'  +  F(t)y  =  f{t )  (3.46) 

where  E(t)  is  identically  singular  for  all  t  6  [to,tf],  we  wish  to  approximate  the 
solution  y(t)  to  (3.46)  on  /  =  [to,tj].  It  is  known  [21]  that  conditions  (A1)-(A4)  are 
equivalent  to  (3.46)  being  solvable  for  every  sufficiently  smooth  f(t).  The  theory  for 
numerically  solving  (3.46)  by  either  the  El  or  ICP  methods  is  reasonably  complete 
[19,  21].  Note  that  prediction  is  not  actually  needed  because  the  equations  are  linear. 
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However,  our  goal  in  carefully  examining  the  effect  of  prediction  on  (3.46)  is  to  develop 
insight  on  what  can  be  expected  for  general  nonlinear  problems.  As  we  shall  see  the 
analysis  of  (3.46)  sheds  considerable  light  on  the  more  general  case. 

The  derivative  array  equations  for  (3.46)  are 

Ey'+Fy  =  / 

(£'  +  F)y'  +  Ey"  +  F'y  =  /' 

{E"  +  2F')y'  +  (2E'  +  F)y"  +  Ey'"  +  F"y  =  /" 

(E,n  +  3  F")y'  +  (3  E"  +  3  F")y"  +  (3  E'  +  F)y"'  +  Ey^  +  F"'y  =  /"' 

£;( Ey'  +  Fy )  =  /M 


which  produces  the  singular  system 

A[i,]+By  =  g  (3.47) 

where 

(A)ij  =  (;->)  E^  +  (71)  F^~l\  j<i,  (3.48a) 

(A)^  =  0,  j>i,  (3.48b) 

(w)i  =  y{l+1\  i  =  (3.48c) 

(J B)i  =  i  =  +  (3.48d) 

07)i  =  /(<_1),  *  =  +  l  (3.48e) 

and  (™)  =  0  if  n  >  m. 


The  derivative  array  equations  in  the  time  varying  case  then  take  the  form 


Cz  =  T>u  +  g 


(3.49) 
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where  z  is  y',w  and  maybe  some  part  of  y  and  u  are  any  fixed  state  variables.  In 
the  El  method,  u  =  y,  C  =  A,  and  T>  =  -B,  while  for  the  ICP  method  u  =  y2. 
Let  the  time  values  for  the  integrator  be  tn  and  the  computed  values  of  z,u,g,C,T> 
at  tn  be  denoted  by  use  of  the  subscript  n.  Thus  the  equation  to  be  solved  by  the 
Gauss-Newton  method  is 


Ci+l^n+l  —  Dn+lUn+l  +  <7n+ 1-  (3.50) 

But  the  un  are  a  subset  of  the  state  values  of  the  integrator.  Thus  we  know  that 

un  =  an  +  0{hk+i)  (3.51) 

where  a(t)  is  a  smooth  function.  Therefore  we  can  rewrite  (3.50)  as 

Cn+lZn+l  =  &n+l  +  0(hk+1).  (3.52) 


where  b(t)  =  D(t)a(t )  +  g(t)  is  also  smooth. 

Given  a  prediction  zn+ 1,  the  Gauss-Newton  applied  to  (3.52)  converges  in  one 
step  to 

Zn+1  =  (I  —  Pn+l)Zn+l  +  Ct+lK+l  +  ^1+1  0(h,k+1 )  (3.53) 


where  P  =  C^C  so  that  I  —  P  is  an  orthogonal  projector  onto  Af(C)  —  7?.(Cf)x,  where 
Af  and  7Z  represent  the  null  space  and  range  space  of  a  matrix.  From  (3.53),  we 
can  see  that  the  component  of  z  that  lies  in  7 Z(CT)  is  uniquely  determined  and  the 
prediction  z  is  projected  onto  Af[C).  It  is  instructive  to  note  that 


P  = 


I  0 

0  D(t) 


due  to  the  1-fullness  of  C  and  the  size  of  I  is  at  least  equal  to  the  dimension  of  y. 
Therefore  from  (3.53)  we  have 


y'n+l 

Wn+1 


0 

0 


0 

I  —  Dn+ 1 


1 

Vn+1 

^n+1 

+  ^n+l  bn+1  +  0(hk+1 ) 
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if  we  are  using  the  El  method.  The  point  is  that  y'  is  not  affected  by  prediction,  but 
the  w  components  are  affected  by  prediction  and  C^b. 

Proposition  3.3.1  Suppose  that  the  Gauss-Newton  iteration  is  applied  to  (3.52)  and 
the  underlying  numerical  integrator  of  the  completion  is  of  order  k  >  1,  the  inter¬ 
polation  polynomial  is  pm)  and  the  sequence  zn  converges  as  h  0+  for  fixed  tn  to 
an  m-\- 1  times  differentiable  function  z(t).  Then  z(t )  is  the  solution  of  the  auxiliary 


differential  algebraic  equation  (ADAE) 

(. I-P)z(m+1)  =  0 

(3.54a) 

II 

** 

(3.54b) 

where  b  depends  on  derivatives  of  f  and  the  solution  y  of  the  LTV  DAE  (3.46). 

Proof:  If  we  multiply  (3.53)  by  I  -  Pn+1  and  Pn+i  we  get  the  following  pair  of 
equations 

(I  —  Pn+l)zn+l  =  (I  ~  Pn+l)zn+l  (3.55) 

Pn+i  zn+i  =  Cl+i&n+i  +  Cl+iO(hK).  (3.56) 

Equation  (3.54b)  follows  immediately  from  (3.56).  From  (3.55)  we  have 

(f  P n+ l)(^n+l  zn+l )  0 


so  that 

( I  Pn+  l){zn+l  Pm{fn+ 1))  —  0. 

Dividing  by  h~(m+1)  results  in 

(r  n  \zn+l  ~  Pmifn+l)  _n 

G  -  F^+i)  ffN+i  -  U- 

Holding  tn  fixed  and  taking  the  limit  as  h  — >  0+  gives  (3.54a)  by  Lemma  3.2.1.  □ 
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Proposition  3.3.1  tells  us  that  if  the  Gauss-Newton  iterates  are  converging,  then 
the  limit  must  be  the  ADAE  (3.54).  This  DAE  has  no  recognizable  structure  such  as 
semi-explicit  or  Hessenberg. 


Proposition  3.3.2  System  (3.54)  is  an  index  m  +  1  solvable  DAE.  Its  solutions 
satisfy  the  auxiliary  ODE  (AODE) 


z(m+ 1)  _  p(m+ 

3=0 

(3.57) 

Proof:  This  follows  by  differentiating  (3.54b)  m  +  1  times  to  get 

Pz (m+1>  =  (Ct6)(m+1)  -  £  ("H-1)  p<"»+W)*W 

3=0 

(3.58) 

Adding  (3.54a)  and  (3.58)  gives  (3.57). 

□ 

In  the  rest  of  this  chapter,  we  focus  only  on  the  cases  where  m  =  0, 1  or  2.  For 
later  reference  we  will  be  referring  to  the  following  ODEs  (3.57)  given  by  Proposition 

3.3.2 

1 

II 

(3.59) 

z"  =  (c'b)"  -2P'z'  -  P"z 

(3.60) 

II 

I 

go 

"a 

1 

oo 

"tl 

r 

^3 

(3.61) 

3.3.1  Growth  of  (/  —  P)z 

Recall  that  in  the  general  DAE  integrators  the  Pz  component  of  z  is  uniquely  deter¬ 
mined  by  the  derivative  array  equations.  The  (/  —  P)z  component  is  arbitrary.  The 
actual  performance  of  general  DAE  integrators  will  be  affected  by  how  (/  —  P)z  varies 
since  the  size  of  this  component  impacts  convergence  of  the  Gauss-Newton  iteration 
and  scaling  of  variables  if  they  become  too  large.  Therefore  it  is  of  considerable 
interest  to  know  how  (/  —  P)z  evolves  during  the  integration. 
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Proposition  3.  3.3  Let  z  be  the  solution  of  the  ADAE  (3.54).  Then  0  =  (I  —  P)z 
satisfies  the  differential  equation 

0(m+1)  =  -J2  (m+1)  P(m+1_i)  +  Cib)°) .  (3.62) 

j= o 

Proof:  From  (3.54b)  we  have  z  =  6  +  &b.  Substituting  for  z  in  (3.57)  gives 

0(m+i)  +  (ct&)(m+i)  =  (Cf6)(m+1)  -  53  (mt!)  p("*+1-J)  (0  +  tfby  (3.63) 

3=0 

which  gives  (3.62).  □ 

Equation  (3.62)  is  a  linear  differential  equation.  The  forced  response  due  to  C *  b 
arises  from  the  solution  y  and  the  forcing  function  /  of  (3.46),  or  equivalently,  the 
u,g  in  (3.49).  The  solution  of  the  associated  homogeneous  equation  then  reflects  how 
8  changes  in  response  to  the  varying  of  P  and  the  choice  of  predictor  z. 

Consider  the  homogeneous  DAE 

(I-P)z(m+l)  =  0  (3.64a) 

Pz  =  0.  (3.64b) 

Proposition  3.3.4  Let  0  be  a  matrix  solution  of  (3.64)  of  maximum  rank.  Let  ||  •  || 
be  the  Frobenius  matrix  norm ,  ||Q||2  =  trace(QTQ)-  Then 


>» 

=  0  for  m  =  0 

(3.65) 

=  2||0'||2  for  m  —  1 

(3.66) 

=  3||©/||2  +  c  form  =  2,  c  constant. 

(3.67) 
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Proof:  Equations  (3. 64a, 3. 64b)  imply  that  0'  €  Af(I-P)  and  0  £  J\f(P).  Therefore 
the  columns  of  0'  are  orthogonal  to  the  columns  of  0  or 

0,T0  =  0T0'  =  0. 

Therefore 

(||0||2)  =  trace(0T0'  +  (0')T0)  =  0 

which  proves  (3.65). 

For  m  =  1,  (3.64)  implies  that  the  columns  of  0"  and  0  are  mutually  orthogonal. 
Hence 

(0T0)"  =  (0")T0  +  2(0')T0'  +  0T0"  =  2(0')T0'. 

Taking  the  trace  gives  the  result  (3.66). 

For  m  =  2,  we  have 

(0T0)"'  =  (0"')T0  +  3(0//)t0"  +  3(0,)t0"  +  0T0"'. 

Since  the  columns  of  0'"  and  0  are  orthogonal  we  have 

7  (l!@ll2)  =  3(trace((ero'  +  (e')Te") 

=  3-7-  (trace  ((0,)T0/). 
dt 

Integration  gives  the  result  (3.67)  with  c  =  —  2trace(0(fo)T ©w(to))  +  ||©,(^o)||2-  D 

Similar  equations  for  higher  m  can  be  derived  but  become  more  complicated  and 
are  not  as  easy  to  interpret. 

We  have  demonstrated  that  if  the  sequence  zn  converges  as  h  — >•  0+,  to  a  smooth 
limit,  then  the  limit  satisfies  the  ADAE  (3.54).  We  now  show  that  the  limiting 
behavior  takes  place.  To  address  this  issue,  note  that  if  we  have  an  ith  order  DAE 
with  no  lower  order  derivative  terms 


F(x^\  x,t)  =  0 


(3.68) 
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and  rewrite  it  as  a  first  order  DAE 


F(v'i,x,t)  = 


(3.69a) 


V-,  -  Vi 


(3.69b) 


v'2-v3 


(3.69c) 


vx-v2  = 


(3.69d) 


where  V\  =  x,  then  we  get  the  following  result. 

Lemma  3.3.1  Applying  a  backward  Euler  (BDF-1)  approximation  to  the  first  order 
DAE  (3.69)  is  the  same,  for  n  >  m,  as  using  the  approximation 


F(J%  (^ri+l  Pi— 1  (tn+l  ))>  ®n+l  >  ^n+1 )  0 


(3.70) 


on  the  ith  order  DAE  (3.68). 

Proof:  We  begin  by  applying  backward  Euler’s  to  (3.69) 


F  f  5  5  ^n+1  )  0 


Vi— l,n+l  l,7i 


^i,n+ 1  —  0 


(3.71a) 

(3.71b) 


^2,n+l  —  v2  ,ti 


-  U3,n+1  =  0 


-Va,n+1  =  0. 


(3.71d) 


We  begin  with  (3.71d)  which  we  rewrite  as 


^n+1  % r 


V2,n+1  =  0 


(3.72) 
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(3.73) 

(3.74) 

(3.75) 


Therefore,  the  sequence  zn  is  a  numerical  approximation  to  the  solution  of  the 
solvable  DAE  (3.69).  The  m  =  0  case  gives  us  an  index  one  DAE.  It  is  known  that 
index  one  solvable  DAEs  can  be  integrated  with  BDF-1  [13]. 


3.3.2  Convergence  to  the  ADAE  for  m  —  1,2 

We  will  now  prove  that  BDF-1  on  (3.54)  converges  for  the  cases  m  =  1,2.  Our  interest 
is  in  the  long  term  dynamic  behavior  of  Zk  so  that  we  will  not  be  concerned  with  the 
first  few  values. 

We  have  observed  in  computational  tests  that  if  the  DAE  is  solvable  on  a  compact 
interval,  and  zo  is  fixed,  then  the  Zk  sequence  stays  bounded  independent  of  h  for  h 
small.  This  suggests  the  boundedness  assumption  on  yn  in  Lemma  3.3.3  is  reasonable. 
We  will  use  the  following  result  from  [86]  in  the  proof  of  Lemma  3.3.3. 
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Lemma  3.3.2  If  ||en+i||  <  (1  +  hL)||en||  +  D,  and  0  <  nh  <  b,  then 


INI  <  (i  +  /.i)"|M  +  i)(1  +  ^)*  1 

<  e“||e0||  +  A(eu  -  1).  (3.76) 

Lemma  3.3.3  Suppose  that  we  have  a  linear  time  varying  ODE,  x'  =  A(t)x  +  f(t ) 
where  A,f  are  continuously  differentiable  on  I  =  [a,  b],  and  consider  two  different 
schemes 


^n+1  %r 

h 


+  fn 


which  is  forward  Euler’s  and 


yn+-r—  =  (An  +  hC(tn,  h))  yn  +  fn  +  0(h) 
a 

where  C  is  smooth  in  t,  h.  Suppose  that  for  a  fixed  yo,  there  is  a  constant  ho  such 
that  yn  is  bounded  for  0  <  nh  <  b  and  0  <  h  <  ho.  Let  en  =  yn  —  xn.  Then 

en  =  0(||eo||)  +  0(h). 


A  similar  result  holds  for  backward  Euler. 


Proof:  The  difference  of  the  two  schemes  gives 


®n+ 1 


h 


=  An  en  +  h  C(tn ,  h)yn  +  0(h)  =  An  en  +  0(h) , 


or 

en+i  =  (/  +  hAn)en  +  hO(h).  (3.77) 

Since  Aft)  is  continuous  on  /,  we  have  ||A(*)||  <  L.  By  taking  norms  of  both  sides  of 
(3.77)  we  have 


®n+l  ^  (I  +  hL)\\en\\  +  Kh2 
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where  K  is  a  constant  given  by  the  O-term.  From  Lemma  (3.3.2)  we  have 


<  e“M  +  ^-(e“  -  1) 

=  «“IM|  +  f(e"-l  )h- 


□ 


We  are  now  ready  to  prove  the  following. 

Theorem  3.3.1  Let  m  =  1,2.  Suppose  that  (3.46)  is  a  uniformly  solvable  DAE  on 
the  interval  X  of  length  L.  Assume  that  the  Gauss-Newton  iteration  is  being  used  with 
a  k  order  integrator  and  fixed  step  h.  Suppose  that  for  zq  in  a  neighborhood  of  a  fixed 
zq,  there  is  a  constant  ho  such  that  zn  is  bounded  for  \nh\  <  L.  Then  the  sequence  zn 
is  an  0(h)  approximation  to  a  solution  of  the  ADAE. 

Proof:  From  the  proof  of  Proposition  3.3.1  and  Lemma  3.3.1,  we  note  that  the  zn  are 
the  same  as  a  backward  Euler  applied  to  the  ADAE.  We  now  show  that  the  backward 
Euler  on  the  ADAE  can  be  viewed  as  a  forward  Euler  on  a  perturbation  of  an  ODE. 
Lemma  3.3.3  then  gives  the  needed  result.  We  show  in  detail  the  result  for  m  =  1. 
The  case  for  m  =  2  is  outlined. 

Since  P  is  a  self-adjoint  projection  we  can  write  it  as 

P  =  VTV 

where  V  is  a  full  row  rank  matrix  with  orthonormal  columns.  Similarly 

I-  P  =  UTU 

where  the  rows  of  U  are  orthonormal  and  also  orthogonal  to  the  rows  of  V .  We  may 
assume  that  U,  V  are  smooth  [18,  59].  The  following  relationships  are  immediate 

UUT  =  I 
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VVT  =  I 

UVT  =  0 

U 
V 

For  convenience  let  g  =  V  C*  b  so  that  the  (3.54)  can  be  written,  after  a  permutation 
of  the  equations,  as 


Let 


-y  =  o 

(3.78a) 

ii 

o 

(3.78b) 

hz  =  g. 

(3.78c) 

r  +  UTs  =  q  +  p 

(3.79) 

so  that  r,s  are  independent  variables.  Equation  (3.78c)  gives  r  =  Vz  =  g  which  is 
solved  uniquely  at  each  time  step.  The  key  is  what  happens  to  the  other  component 
s  of  z.  Note  that  (3.78b)  is 


U  ({UT  s)"  +  q")  =  0 


or 


s"  +  2U  ( U'f  s'  +  U  (U")T  s  +  Uq"  =  0. 

Applying  BDF-1  to  (3.78)  we  have 

Zn+ 1  -  Zn 


h 


Vn+l  —  0 


t  t  Un+1  Vn  n 

Un+1 — h —  "  °‘ 


Using  (3.81)  for  yn+ 1  in  (3.82)  gives 

/  Zn+I  ~zn  _  £n  zn—  1 


U, 


n+1 


=  0. 


(3.80) 


(3.81) 

(3.82) 
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If  we  substitute  (3.79)  for  z  and  use  the  fact  that  q  =  VTg  =  VTr  is  known,  then  we 
have 

(Pn+ 1  ~Pn  Pn—pn— 1  \ 

— - ^ - - - 1-  q"+ 1  +  0(h)  J  =  0 

or  equivalently 

Un+1  (f/J+1s„+i  -  2 U^sn  +  Ul_ =  -A2  £/„+,  C+,  +  0(A3).  (3.83) 

Working  with  the  left  side  of  (3.83) 

•Sn+l  2  Un+\  Sn  -f"  dn+1  U n_±Sn—\ 

=  •Sn+i  —  2sn  +  sn_i  +  2  C/n+i  (U^+1  —  U^)sn  +  Un+i(—Un+i  +  £7j_i)sn-i 
=  —  2sn  +  5n_i  +  2t/„+i(C/J+1  —  U^)(sn  —  Sn_i)  + 

t/„+1  (t/J+I  -  2  C/J  +  )s„_, . 


Dividing  by  h 2  we  get 

^n+l  2sn  T 
h? 


+  2  t/n+l  ((C%.J  +  0(h)) 


Sn  $n— 1 

h 


+ 


Un+ .  ((tC,)r  +  0(A))  «„-i  =  -0,+i  Cl  +  0(A) 


or 


sn+1  -  2s 

n  “h  Sn—  1 

A3 


+  2t/„_1((£/J.1)'  +  0(A)) 


1 

h 


+ 


Un—l  {(U'Uf  +  0(fc))  *»-l  =  Un-l  q"+1  +  0(/l) 


which  is  an  0(/i)  perturbation  of  a  forward  Euler’s  applied  to  (3.80).  Thus  if  our 
original  iterative  sequence  is  bounded  independent  of  h ,  then  the  sn  sequence  is 
bounded  independent  of  h.  By  Lemma  3.3.3  this  sequence  must  approximate  the 
numerical  solution  of  (3.80). 


Chapter  3.  Prediction  Strategies  in  General  DAE  Integrators  72 

For  m  =  2,  a  similar  analysis  leads  to 

z'  —  w  —  0  (3.84a) 

w'  —  y  =  0  (3.84b) 

Uy'  =  0  (3.84c) 

Vz  =  g.  (3.84d) 

and  the  ODE 

s'"  +  3  U  ( U')Ts "  +  3/7  (f/")Ts'  +  U(U'")Ts  +  ty"  =  0.  (3.85) 

Applying  BDF-1  to  (3.84)  gives 


CWCi  -  3£?*»  +  3 -  UL ,«.-»)  =  -h3un+w:+1  +  0(h“).  (3.86) 

Again,  we  first  work  only  with  the  left  side  of  (3.86) 

•Sn+l  —  3Un+\U^Sn  +  3 /7n+i/7j_1S„_i  —  /7n+i/7j_2'Sn-2 
=  ^n+l  3  Sn  +  3  Sn_i  &n—2  4”  3  Un^.\  (/7n_j_j  Un  )5n  + 

3t/n+i(-/7j+1  +  t/^K-1  +  /7n+i(/7j+1  -  U„_2)sn-2 
=  ^n+1  3  Sn  +  3  3n_i  $n—2  4“  3  Un+\  {Un+1  Un  )  ^  &n— 1  4"  Sn—2^  + 

3 (/„+,  (t+.  -  2Uf  +  sn_,  +  C/„+1  (-2f/J+1  +  3C/J  -  t/J_2)  s„_2 
“  ^n+1  3  +  3  Sn—2  4“  3  Un+\  (un+1  Un  ^  (5n  2  S71.— 1  +  *Sn— 2)  + 

3  Un+i  (ut+ 1  -  2  Ul  +  C/J_j)  K-1  -  s»-a)  + 

Un+i  (yl+1  -  3  £/J  +  3  -  DJ_2)  sn_2. 

Dividing  by  hz  we  get 

Sn+1  -  3sn  +  3fn-l  -  *n-2  +  3  jj^  +  ~  2^-l  +  ^n-2 


+ 
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3  £/„«  (M+1)T  +  0(h))  s -■  h  s"-2  +  tu.  ((EC.)T  +  0(A))  s„-2 

=  -tWi  Ci  +  0(A) 

or 

*w+1~3*n  +33a"-i-*~-2  +  3  t/n_2  ((C/J_2y  +  0(h))  5w~23w~21+  — 

3  Un—2  ((K_2)T  +  0(h))  ^'X  ~  ^  +  C/n-2  ((C/r2)T  +  0(h))  Sn_2 

=  -Un.2  <'_2  +  0(h) 

and  the  result  follows  as  with  the  m  =  1  case. 


3.3.3  Index  2  Example 

In  order  to  illustrate  the  previous  analysis  we  consider  the  following  index  two  LTV 


system  [46] 

with  solution 

0  0 

1  rjt 

v'  ]  +  [ 1 
2/2  J  [o 

vt  yi 

1  +  n  J  [  2/2 

m' 

0 

(3.87) 

yi(t)  = 

f(t)  +  rjtf(t) 

(3.88) 

yi  (*)  = 

-m- 

(3.89) 

BDF-1  applied  to  (3.87)  results  in 

yi,n  +  Vtn  2/2, n 

=  fn 

(3.90) 

J/l,n  2/l,n-l 

U 

.  ,  2/2, n  - 

+  rjtn 

=T^=i  +  (l  +  ^)2/2,n 

=  0. 

(3.91) 

After  solving  (3.90)  for  yijTl  and  substituting  this  expression  into  (3.91)  for  yi and 
2/i,n—i ;  then  solving  for  y2,n  we  have 

Ul,n  = 


fn  V  112,71 


(3.92) 
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1)2, n 


f]  fn  fn— 1 

1+1) w "  Mi  +  n)  ’ 


Assume  /  =  0  in  (3.87),  then  we  have 


(3.93) 


Dl,n  —  f)  tn  1)2, n 

V 

2/2,71  —  .  2/2,71—1  • 

1  +  T] 


(3.94) 

(3.95) 


As  noted  in  [46]  it  is  required  that  |^|  <  1  in  order  for  y2,n  to  be  stable  (i.e. 
—  1/2  <  rj),  otherwise  y2,n  will  oscillate  and  become  unbounded  as  n  — >  oo.  This 
example  illustrates  where  BDF  methods  applied  to  an  index  two  linear  time  varying 
DAE  can  fail.  However,  the  general  DAE  methods  (El  or  ICP)  can  integrate  (3.87) 
without  difficulty. 

The  derivative  array  equations  for  (3.87)  are 


0 

0 

0 

0 

0 

0 

1 

¥ 

y[ 

"  / 

0 

2/2 

1 

n* 

0 

0 

0 

0 

0 

1  +  v 

ft 

0 

0 

V\ 

1 

rjt 

0 

0 

0 

0 

0 

V 

y'2 

r 

0 

0 

1+2V 

1 

¥ 

0 

0 

0 

0 

iff 

0 

0 

0 

1 

¥ 

0 

0 

0 

0 

iff 

f" 

0 

y\ 

0 

0 

0 

1  +  3?7 

1 

¥ 

0 

0 

0 

0 

2/2  _ 
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We  will  use  the  El  method  in  this  example.  The  system  that  we  will  solve  is 


0 

0 

0 

0 

0 

0 

'  y[  ' 

f  -X 1  -  rjtx2 

1 

¥ 

0 

0 

0 

0 

3/2 

-(1  +  77)0:2 

1 

¥ 

0 

0 

0 

0 

y'[ 

/'  - 

0 

1  +  27? 

1 

¥ 

0 

0 

y'i 

0 

0 

277 

1 

¥ 

0 

0 

y'C 

/" 

0 

0 

0 

1  +  377 

1 

¥ 

*4" 

0 

(3.96) 


We  can  find  a  completion  to  the  least  squares  equations  by  premultiplying  the  equa¬ 
tion  Cz  =  g  (3.96)  by  C+  to  obtain 


CfC  =  Cfg 


(3.97) 


/  0 

0  D(t) 


w 


(3.98) 


When  numerically  solving  (3.96)  the  computation  of  C 1  can  easily  be  obtained 
by  either  an  SVD  or  QR.  In  order  to  examine  the  solutions  given  by  the  El  method 
we  will  compute  an  analytic  generalized  inverse  of  C.  Let  C  =  BC  be  a  full  rank 
factorization  of  C.  Then  it  is  well  known  [47]  that  the  Moore-Penrose  inverse  of  C  is 
given  by 


Cf  =  Ct(CCt)~1(BtB)-1Bt. 


(3.99) 
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P 


where  D  =  pH4  +  9 rj2  +  2 rft2  +  Qrj  +  2.  Notice  that  P  — »■  P*  as  t  — >  oo  where  P*  is 
the  diagonal  matrix  whose  elements  are  {1, 1,0, 1,0, 1}. 


Chapter  3.  Prediction  Strategies  in  General  DAE  Integrators 


77 


C'g  = 


(3.101) 


We  also  have 

-?-W2  +  £+>)</" 

-/" 

(l+27))(2+67;+97)2+7)2t2)/" 
jj4t4+97)2+27)2T2+6»)+2 

?;i(l+27;)(l+?)2f2)/" 

174  <4  +97)2+27?2 12  +67) -f 2 

-^(1+3t?)(1+27?)/// 
r?4 14 +9??2  +2t?2  t 2  +6??+2 

— T72i2(l+3T?)(l+27j)/,/ 

774tf4+9772+27?2<2-f  6??+2 

The  numerical  results  that  follow  integrate  (3.87)  on  [0,20],  rj  =  —3/4  and  f(t)  = 
cos(f). 

The  are  given  by  the  Gauss-Newton  iteration 


zn  — 


Vn 

Wn 


=  (/  -  P)zn  +  c\gn 


(3.102) 


The  y'  components  are  uniquely  determined  (3.101) 

y2  sin(t)  3 


Vi  = 


2  +^^cos(t) 


y'2  =  cos  (<). 


The  other  components  of  0  are  given  by 

9f2(16  +  9t2)uii  +  1 2t  ( 1 6  +  9  t2)w2  +  240 tw3  —  180t2w4  +  8  cos(f)(41  +  9f2) 
Wl  =  656  +  288t2  +  81t4 

12i(16  +  9f2)u)i  +  16(16  +  9 t2)w2  +  320uJ3  -  240tu54  -  6*  cos(*)(16  +  9t2) 
W 2  “  656  +  288 12  +  81t4 

240tvh  +  320w2  +  (400  +  144t2  +  81f4)u53  +  12i(16  +  9 t2)w4  -  120 1  cos (t) 
W3  ~  656  +  288 12  +  81t4 

— 180f2tui  -  240fu52  +  12f(16  +  9t2)u;3  +  16(41  +  9 t2)w4  +  90f2  cos (t) 
m  ~  656  +  288f2  +  81*4 
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All  computations  were  done  on  a  CTX  PC  with  an  Intel  Pentium  120  MHz  pro¬ 
cessor.  The  El  code  was  written  in  FORTRAN  77  and  LAPACK  [4]  was  used  to 
performed  numerical  computations  of  C*. 

Figure  3.1  shows  the  result  of  using  polynomial  interpolation  for  the  cases  m  = 
0,1,2.  A  second  order  ABM  method  with  fixed  step  size  h  =  .05  is  used  in  integrating 
the  completion.  The  top  two  plots  are  the  graphs  of  y[  and  y'2  which  are  unaffected 
by  prediction.  The  next  four  plots  of  Wi,...,W4  demonstrates  essentially  bounded 
behavior  for  m  =  0,  slow  linear  growth  for  m  =  1  and  much  stronger  growth  in  some 
of  these  components  for  m  =  2.  We  observe  this  in  Figure  3.1  for  m  =  0, 1,2. 

Convergence  of  BDF-1  for  the  index  one  case  (m  =  0)  is  clearly  seen  in  Figure 
3.2.  The  y'  components  of  the  z  vector  are  omitted  since  they  are  not  affected  by 
prediction  (assumption  (A3));  only  the  w  components  are  displayed  since  these  values 
change  depending  on  different  prediction  strategies.  The  AODE  (3.59)  is  solved  using 
the  numerical  code  ODE.F  written  by  Shampine  [83]  (obtained  via  NETLIB).  The 
absolute  and  relative  error  tolerances  were  set  to  10-6  in  ODE  and  the  step  size  was 
fixed  at  h  =  .05.  Figure  3.2  shows  that  the  z  components  given  by  integrating  the 


AODE 


z'  =  {c]gy  -  p'z 


and  the  z  components  given  by  the  Gauss-Newton  iteration 


z  =  (I-P)z  +  tfg 


are  approximating  the  same  differential  equation. 

For  the  higher  order  predictors,  m  =  1,2,  we  initialized  the  AODEs  (3.60,3.61) 
at  t  =  5h  using  the  values  of  z  given  by  the  Gauss-Newton  iteration  at  that  time. 
Figures  3.3  and  3.4  show  the  values  of  w  components  given  by  integrating  (3.60)  and 
the  Gauss-Newton  method  respectively.  Again  we  observe  that  the  AODE  is  correctly 
capturing  the  behavior  of  the  z  sequence.  These  figures  also  illustrate  convergence 
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as  h  — »  0+  of  z  to  the  AODE  solution.  Figures  3.5  and  3.6  demonstrate  the  same 
conclusion  for  the  m  =  2  case. 

The  previous  graphs  dealt  with  the  solutions  themselves.  The  theory  developed 
earlier  states  that  it  is  the  (/  —  P)z  term  whose  growth  depends  on  the  choice  of  the 
predictor.  For  time  invariant  DAEs  the  governing  differential  equation  is  #(m+1)  =  0 
and  polynomial  growth  results.  However,  the  example  considered  here  is  not  constant 
coefficient.  An  examination  of  the  derivative  array  shows  that  the  coefficients  of  the 
the  solutions  of  the  derivative  array  equations  are  asymptotically  constant. 

Figures  3. 7-3. 9  show  the  graphs  of  \\(I—P)z\\  for  various  step  sizes  and  m  =  0, 1,2. 
Several  points  are  worth  noting.  First,  asymptotically  the  graph  looks  constant,  lin¬ 
ear,  and  quadratic  for  m  =  0, 1,2  respectively,  as  expected.  Note  the  initial  oscilla¬ 
tion.  While  Proposition  3.3.4  asserts  that  ||0||  is  constant,  || (/  —  P)z\\  is  not  initially 
constant.  However  we  observed  earlier  that  last  four  components  of  the  vector  C^g 
(3.101)  will  decay  to  zero  as  t  increases.  Additionally  the  matrix  P  will  converge  to 
a  constant  matrix. 

Finally,  we  compare  the  global  error  in  each  component  of  the  solution  at  time  tn 
by  computing 

max  |eJin|  =  max  | yj>n  -  %•(*„) |.  (3.103) 

Table  3.1  shows  that  the  prediction  strategy  did  not  affect  the  global  error.  The 
numerical  solution  had  the  same  accuracy  regardless  of  the  prediction  strategy  for 
this  example. 


t—H 

II 

-e 

h  =  .05 

h  =  .025 

h  =  .0125 

max„  |e1>n| 
maxn  |e2)„| 

5.5669D-02 

4.4144D-03 

1.4329D-02 

1.0732D-03 

3.6292D-03 

2.6479D-04 

9.1918D-04 

6.6067D-05 

Table  3.1:  Global  Errors  for  Index  2  LTV,  m  =  0, 1, 2. 


Chapter  3.  Prediction  Strategies  in  General  DAE  Integrators 


Figure  3.1:  z  from  Gauss-Newton  given  by  m  =  0, 1,2  prediction. 


Figure  3.2:  z'  =  (C^b)'  —  P'z  and  w  from  Gauss-Newton,  m  =  0. 
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Figure  3.4:  w  from  Gauss-Newton,  m 
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Figure  3.8:  || (/  —  P)z ||  for  Index  2  LTV,  m  =  1. 
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Figure  3.9:  || (/  —  P)z ||  for  Index  2  LTV,  m  =  2. 

3.3.4  Index  4  Example 

Earlier  we  discussed  that  prediction  could  affect  the  convergence  of  the  Gauss-Newton 
iteration.  This  next  example  illustrates  integrating  an  ill-conditioned  problem  and 
the  effects  of  prediction.  We  also  examine  the  use  of  a  low  order  integrator  on  a 
higher  index  DAE.  We  begin  by  considering  the  linear  time  invariant  DAE 

Nx'  —  x  —  /(f)  (3.104) 


where  (a,  (3  ^  0) 


a 

t 

(3 

sin(t) 

0  1 

>  f(t)  = 

e-< 

0  1 

t 2 

0  1 

te-* 

0  _ 

cos(f) 
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The  block  diagonal  structure  of  N  consists  of  a  nilpotent  matrix  whose  index  is  four 
[32]  so  that  (3.104)  is  an  index  four  DAE.  A  solution  is  given  by 


X\  =  e<y,“  —  t  —  a 

(3.105a) 

X 2  -  ^  l+/?2SmW  l+/?2COSW 

(3.105b) 

£3  =  —  sin(t)  —  te~l  +  e-t  —  2 1 

(3.105c) 

£4  =  cos(f)  —  e-<  +  te~l  —  t2 

(3.105d) 

£5  =  sin(i)  —  te~l 

(3.105e) 

03 

O 

O 

1 

II 

<X> 

H 

(3.105f) 

By  applying  an  orthogonal  time- varying  change  of  coordinates  x  =  U ( t)y  where 

sin2(u;t)  L  cos2(uit)  —L  0  0 

L  —  sin2(o;t)  0  0  L  cos2(a;t) 

0  0  sin2(ud)  L  cos  2(wt)  —L 

m  = 

L  cos2(c ot)  —L  sin2  (cut)  0  0 

cos2(od)  —  L  0  0  —  sin2(u>;t)  —  L 

0  0  L  cos2(u>t)  —  L  sin2(a;i) 

and  L  =  sm(oot)  cos(u;f),  we  construct  the  index  four  linear  time  varying  DAE 

NU{t)y'  +  (NU\t)-U{t))y  =  f.  (3.106) 

whose  solution  is 

y  =  UT(t)x  (3.107) 


and  is  displayed  in  Figure  3.10. 
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We  differentiated  each  equation  in  (3.106)  four  times  so  that  the  derivative  array 
equations  consisted  of  30  equations  in  {y\ y",  y"' ,  y(4),  y(5),  y}.  The  parameters  a,f3 
and  u)  can  be  altered  to  change  the  conditioning  of  the  problem.  The  numerical  results 
that  follow  are  for  the  case  where  a  =  j3  =  —1  and  u>  =  2.  For  these  values  of  a,/? 
and  cu,  the  singular  values  of  C  varied  from  10~4  to  103  at  each  step  of  the  integration. 

Second  and  third  order  ABM  methods,  denoted  by  ABM2  and  ABM3  respectively, 
were  used  to  integrate  the  completion.  Figure  3.11  show  the  growth  of  the  nonunique 
components  || (/—  P)z\\  for  various  step  sizes  and  ABM2.  The  graphs  for  ABM3  were 
similar.  Linear  interpolation  (m  =  0)  was  used  in  predicting  all  values  of  z  at  the  next 
time  step.  We  conclude  that  the  El  method  correctly  computed  a  numerical  solution 
to  the  index  four  DAE  (3.106)  using  a  lower  order  integrator  on  the  completion  and 
the  prediction  strategy  m  —  0.  Tables  3. 2-3. 3  are  the  global  errors  using  ABM2  and 
ABM3.  The  numerical  solution  is  displayed  in  Figure  3.12. 

However,  the  result  was  much  different  for  the  m  =  1  case.  Figure  3.13  shows 
that  || (/  —  P)z ||  blows  up  around  t  =  3.  As  h  decreased  we  observe  that  the  Gauss- 
Newton  failed  to  converge  earlier  in  the  integration.  We  also  attempted  to  use  higher 
order  integrators,  Figure  3.14.  Convergence  still  did  not  occur  when  applying  m  =  1 
prediction  to  all  of  the  z  components. 

In  order  to  examine  the  use  of  the  Nordsieck  vector  to  initialize  the  z  components, 
we  integrated  the  completion  with  different  ABM  schemes.  We  initialized  as  many 
components  of  z  as  possible  given  by  the  integrator.  Remaining  components  were 
initialized  using  prediction.  For  example,  if  we  used  the  integrator  ABM3,  then 
we  would  initialize  the  z  components  corresponding  to  {y',  y",  y1"}  using  the  values 
contained  in  the  Nordsieck  vector.  We  fixed  the  step  size  at  h  =  .025.  Figure 
3.15  shows  again  that  the  m  =  0  case  could  be  successfully  applied  to  components 
for  which  the  Nordsieck  vector  did  not  provide  enough  information  to  initialize  the 
entire  z  vector.  It  is  important  to  note  that  the  numerical  solution  is  not  effected  by 
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prediction.  Whether  we  initialized  all  the  z  components  by  prediction  or  part  using 
the  Nordsieck  vector,  the  numerical  solution  returned  the  same  global  errors. 

However,  we  still  had  failure  in  the  m  =  1  case  when  we  used  the  lower  order 
integrator  ABM2.  Figure  3.16  shows  that  m  =  1  did  work  for  ABM3,  ABM4  and 
ABM5.  Again,  it  should  be  emphasized  that  a  fourth  order  ABM  method  using  a 
Nordsieck  implementation  provides  approximations  to  derivatives  of  the  true  solution 
up  to  the  fourth  order.  In  this  case  the  only  components  in  the  z  vector  that  are 
initialized  by  polynomial  prediction  are  those  corresponding  to  y(B\  ABM5  provided 
approximations  to  all  the  derivatives  appearing  in  the  derivative  array  equations. 
ABM5  also  controlled  the  growth  of  || (/  —  P)z\\  better  than  any  of  the  other  meth¬ 
ods.  Our  conjecture  that  using  the  Nordsieck  vector  for  initialization  helps  the  initial 
iterate  for  the  Gauss-Newton  solve  to  remain  close  to  the  solution  manifold  of  the 
DAE  appears  valid  in  this  example.  Prediction  did  not  affect  the  numerical  solution 
where  convergence  occured.  However,  higher  order  prediction  caused  failure.  Us¬ 
ing  the  Nordsieck  vector  for  initialization  provides  a  safeguard  for  convergence  in  ill 
conditioned  problems. 
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Figure  3.10:  True  Solution  to  Index  4  LTV  (3.106). 
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Figure  3.11:  ||  (/  —  P)z ||  for  Index  4  LTV,  m  =  0,  ABM2. 


6),  m  =  0,  ABM2. 


1,  ABM2. 
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Figure  3.16:  ||(/  —  P)z\\  for  Index  4  LTV,  m  —  1  and  Nordsieck. 


h  =  .1 

h  =  .05 

h  =  .025 

h  =  .0125 

maxn  \eltn\ 

2.3302D+01 

6.3993D+00 

1.4535D+00 

3.3323D-01 

max„  |e2,n| 

1.6569D+01 

4.5085D+00 

1.0223D+00 

2.3430D-01 

maxn  |e3)n| 

3.4310D+00 

1.7658D+00 

4.9174D-01 

1.2429D-01 

maxn  |e4,n| 

1.7861D+01 

5.6336D+00 

1.3829D+00 

3.3104D-01 

maxn  |e5,n| 

2.7166D+01 

8.0756D+00 

1.9080D+00 

4.4649D-01 

maxn  |e6)n| 

1.6160D+01 

5.0587D+00 

1.2323D+00 

2.9364D-01 

Table  3.2:  Global  Errors  for  Index  4  LTV,  m  —  0,  ABM2. 
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h  =  .1 

h  =  .05 

h  =  .025 

h  =  .0125 

maxn  |eijn| 

5.5484D+00 

1.5554D-01 

1.8873D-02 

4.3434D-03 

maxn  |e2,n| 

3.9334D+00 

1.1373D-01 

1.3470D-02 

3.0437D-03 

max„  |e3)„| 

2.3633D+00 

1.7231D-01 

1.0350D-02 

8.1917D-04 

max„  |c4,n| 

5.7546D+00 

2.8564D-01 

1.1906D-02 

2.7959D-03 

maxn  |e5,n| 

7.4058D+00 

2.6152D-01 

1.3183D-02 

4.4531D-03 

maxn  j  e6,«  | 

4.8886D+00 

2.2686D-01 

6.9777D-03 

2.1837D-03 

Table  3.3:  Global  Errors  for  Index  4  LTV,  m  =  0,  ABM3. 


h  =  .  1 

h  =  .05 

h  =  .025 

h  =  .0125 

ma^  |ei,„| 

5.9281D-01 

8.4212D-02 

5.3819D-03 

3.1340D-04 

maxn  |e2,„| 

4.4651D-01 

5.9521D-02 

3.7725D-03 

2.2247D-04 

maxn  |e3)„| 

3.6539D-01 

1.2031D-02 

1.4115D-03 

1.0623D-04 

max^  | 

3.5716D-01 

6.3850D-02 

4.6371D-03 

2.9836D-04 

maxn  |e5>n| 

3.5284D-01 

9.8719D-02 

6.8081D-03 

4.1312D-04 

max,,  |e6i„| 

2.8794D-01 

5.6071D-02 

4.2017D-03 

2.6566D-04 

Table  3.4:  Global  Errors  for  Index  4  LTV,  m  =  1  and  Nordsieck,  ABM4. 


h  =  .1 

h  =  .05 

h  =  .025 

h  =  .0125 

maxn  |ei,n| 

7.5466D-01 

1.2311D-02 

1.2340D-04 

6.1310e-006 

ma^  |e2,n| 

5.3438D-01 

8.5953D-03 

8.9314D-05 

5.7579e-006 

max„  |e3,n| 

1.9715D-01 

5.8098D-03 

1.2323D-04 

6.9728e-006 

maxn  1^4^! 

6.4097D-01 

1.3592D-02 

2.1520D-04 

7.1199e-006 

max^j,  1 65,71) 

9.3373D-01 

1.7077D-02 

2.0312D-04 

6.7074e-006 

max,!  |e6,n| 

5.5461D-01 

1.1753D-02 

1.7759D-04 

3.9276e-006 

Table  3.5:  Global  Errors  for  Index  4  LTV,  m  —  I  and  Nordsieck,  ABM5. 
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3.4  Nonlinear  Case 

In  the  nonlinear  case  we  have  the  derivative  array  equations  given  by  the  nonlinear 
system 

G(z,t)  =  0  (3.108) 

where  J  =  Gz  is  assumed  to  have  constant  rank.  In  Section  2.4  it  was  demonstrated 
that  the  Gauss-Newton  method  applied  to  (3.108)  can  be  viewed  as  the  integration 
of  an  ODE  with  a  suitable  step  size.  Given  a  fixed  i,  solving  (3.108)  for  z  would 
involve  integrating 

^  =  -G\(z(  r,  (),  t)G(z(r,  i).  t)-  (3.109) 

We  shall  view  the  generation  of  the  zn+ 1  iterates  as  a  two  step  process.  First,  we 
predict  zn+ 1  by  polynomial  interpolation.  Second,  we  follow  the  Gauss-Newton  flow 
to  steady  state  zn+i .  Then  in  a  neighborhood  of  the  solution  of  the  DAE  there  exists  a 
function  4>(z,i)  which  is  the  limiting  value  of  the  continuous  Gauss-Newton  starting 
at  z  and  time  t.  By  definition  4>($(z,t),i)  =  4>(z,  t).  Differentiating  with  respect  to 
z  we  note  that  is  a  projection.  In  the  linear  case  we  had  $(z,  t)  =  (/  —  P)z  +  C^b 
where  P  =  C^C  and  $z  =  I  —  P.  Another  property  of  $  by  definition  is 

j'mz,t),t)G{$(z,t),t)  =  0.  (3.110) 

In  the  linear  time  varying  case  this  is  precisely  equation  (3.54b). 

One  step  of  predict  and  iterate  then  becomes 

Zn+l  ~  *&(Pm(tn+l')i  ^n+l)* 

Thus 

0  —  %n+ 1  ^{jPm  (tn+1  )>  fn+1 ) 

=  ^ra+l)  ^(Pm(Wl)i  ^n+l) 

=  ^^(^n+l  >  fn+lX^n+l  Pm{l n+l))  4"  G(||zn^-i  Pm(^n+l)||  ). 
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Assuming  the  limit  exists,  we  divide  by  hm+1  and  get  that  the  limit  as  h  — >  0+  must 
then  satisfy 

$z(z,t)z(m+1\t)  =  0 

which  is  (3.54a).  We  also  have  that  zn+1  is  the  limit  of  the  Gauss-Newton  iteration 
so  that 

J^(zn+i ,  tn+i)G(zn+i ,  tn+i)  =  0 

which  is  the  nonlinear  analogue  of  (3.54b).  Thus  we  have  the  following 

Proposition  3.4.1  Suppose  that  the  continuous  Gauss-Newton  iteration  is  being 
used,  the  underlying  numerical  integrator  of  the  completion  is  of  order  n  >  1,  the 
interpolation  polynomial  is  pm,  and  the  sequence  zn  converges  as  h  —$■  0+  for  fixed  tn 
to  an  m  A  1  times  differentiable  function  z(t).  Then  z(t)  is  the  solution  of  the  index 
m  +  1  auxiliary  differential  algebraic  equation  (ADAE) 

$z(z{t),t)z(m+l\t)  =  0  (3.111a) 

j\z{t),t)G(z(t),t)  =  0.  (3.111b) 


Proof:  It  only  remains  to  show  that  (3.111)  is  index  m+  1.  Equation  (3.111b)  says 
that  z(t)  is  a  fixed  point  of  the  Gauss-Newton  iteration  and  thus  z(t )  =  $(z(f),t). 
Differentiating  with  respect  to  t  we  get  z'  =  §z  z'  +  so  that 

(/  -  §z)z'  =  -$t. 

Differentiating  this  equation  m  times  with  respect  to  t  yields 

(/  -  $,)*(TO+1>  =  V(t,  z,...,  z^).  (3.112) 

Adding  (3.112)  and  (3.111a)  gives  ^(TO+1)  =  T(t,z, . .  .,z^)  and  (3.111)  is  index  m+1. 
□ 
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Proposition  3.4.1  concludes  that  if  the  continuous  Gauss-Newton  converges,  then 
the  limit  will  again  satisfy  an  auxiliary  DAE.  However,  (3.111)  is  nonlinear  and  com¬ 
puting  an  explicit  AODE  that  solutions  to  this  DAE  will  also  satisfy  cannot  be  done 
here.  We  saw  in  the  linear  time  varying  case  that  the  nonunique  components  (/  —  P)z 
satisfied  a  linear  ODE  (3.62)  whose  dynamics  varied  with  the  prediction  strategy.  We 
should  also  expect  that  the  nonunique  components  will  change  based  on  the  choice 
of  predictor  z  in  the  nonlinear  case. 

3.5  Implications 

So  far  in  this  chapter  we  have  carefully  examined  what  happens  for  continuous  up¬ 
dating  of  the  the  Jacobians.  In  practice,  numerical  integrators  try  to  reuse  Jacobians 
as  much  as  possible.  This  can  sometimes  introduce  difficulties  with  DAEs  that  do 
not  occur  with  ODE  integrators  [38].  This  section  will  briefly  discuss  this  problem 
along  with  the  implications  of  the  preceding  analysis. 

An  important  special  case  is  when  the  range  of  P  is  constant.  In  this  case,  the  0 
equation  becomes  0(m+1)  =  0  so  that  0  is  an  mth  degree  polynomial.  Predictors  given 
by  m  =  0, 1  are  safe  to  use  in  this  instance.  The  initial  conditions  for  the  z  equation 
can  be  made  smaller  by  solving  the  Gauss-Newton  iteration  to  higher  accuracy.  For 
moderate  length  time  intervals  and  well  conditioned  problems  m  =  2  could  possibly 
be  used. 

In  actual  implementations  of  a  general  integrator  one  wants  to  reuse  Jacobians 
as  much  as  possible  [37,  38].  A  slight  modification  of  the  preceding  analysis  can 
be  applied  during  the  time  that  the  Jacobian  is  constant  to  again  get  0(m+1)  =  0. 
However  the  size  of  the  initial  values  of  0  are  based  not  only  on  the  iteration  but 
also  the  updating  [38]  so  that  these  values  cannot  be  assumed  to  be  very  small. 
Accordingly  m  =  0, 1  are  better  choices. 
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The  analysis  in  this  chapter  has  another  important  consequence.  In  previous 
discussions  of  general  DAE  integrators  it  was  assumed  that  the  order  of  the  integrator 
was  greater  than  the  index.  This  was  done  so  that  the  Nordsieck  vector  could  be 
used  for  a  predictor  in  the  Gauss-Newton  iteration.  This  allows  for  the  integration 
of  many  important  DAEs  but  has  the  disadvantage  that  it  requires  the  use  of  high 
order  integrators  for  moderately  indexed  DAEs  and  rules  out  the  integration  of  DAEs 
of  indices  much  above  four. 

The  fact  that  the  arbitrary  terms  terms  also  satisfy  a  differential  equation  means 
that  one  can  relax  this  assumption.  This  is  important.  One  can  use  low  order 
integrators  on  higher  index  DAEs  provided  that  polynomial  prediction  is  used  for  the 
w  terms. 

We  have  shown  that  the  growth  of  the  nonunique  terms  in  the  the  Gauss-Newton 
solver  as  part  of  a  general  DAE  integrator  satisfies  an  auxiliary  DAE  which  depends 
on  the  predictor.  The  choice  m  —  1  appears  safest,  but  m  =  0  may  have  to  be  used 
on  hard  problems.  The  choice  m  —  1  will  speed  up  the  Newton  iteration  somewhat 
and  avoids  too  rapid  a  growth  of  the  0  terms  as  described  for  the  first  time  in  this 
chapter.  The  use  of  integrators  of  order  less  than  the  index  has  been  justified  provided 
they  are  combined  with  predictors  in  the  El  and  ICP  methods.  These  conclusions  are 
also  expected  to  hold  for  IOI  based  integrators. 
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Current  methods  available  in  codes  for  the  numerical  integration  of  DAEs  are  limited 
to  systems  of  low  index  (u  <  1)  or  systems  having  a  special  structural  form  such  as 
being  Hessenberg.  These  methods  are  also  known  to  suffer  from  a  loss  of  accuracy 
in  the  higher  index  variables.  Methods  discussed  in  this  thesis  are  being  developed 
to  integrate  fully  implicit,  higher  index,  unstructured  nonlinear  DAEs.  While  these 
general  DAE  integration  methods  will  not  likely  replace  existing  codes  for  index  one 
problems,  they  can  be  applied  to  problems  where  existing  methods  are  known  to  fail. 

The  goals  of  this  thesis  were  to  examine  ways  to  make  these  general  methods 
more  numerically  robust.  One  area  of  investigation  was  obtaining  consistent  ini¬ 
tial  conditions  for  the  derivative  array  equations.  We  cannot  expect  to  have  good 
approximations  to  the  higher  order  derivatives  at  the  beginning  of  the  integration. 
We  investigated  line  search  algorithms  as  a  means  to  globalize  the  nonlinear  equation 
solvers.  We  conducted  computational  studies  which  conclude  that  damping  strategies 
are  appropriate  in  determining  consistent  initial  conditions. 

A  second  area  of  investigation  was  the  initialization  of  the  Gauss-Newton  method 
by  polynomial  interpolation  at  each  step  in  the  integration  of  the  completion.  Using 
a  fixed  step  size,  we  have  proven  that  the  undetermined  components  in  the  derivative 
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array  equations  will  satisfy  an  auxiliary  DAE  that  depends  on  the  predictor.  Addi¬ 
tionally  we  examined  the  growth  of  these  nonunique  components  in  the  linear  time 
varying  case.  We  have  demonstrated  the  use  of  low  order  integrators  when  combined 
with  polynomial  prediction  in  the  integration  of  higher  index  DAEs.  The  develop¬ 
ment  of  both  the  El  and  ICP  methods  will  incorporate  the  reuse  of  Jacobians  due  to 
computational  cost  and  efficiency.  In  this  case,  prediction  can  be  safely  applied. 

In  summary,  we  have  examined  various  globalization  strategies  for  computing  con¬ 
sistent  initial  conditions.  It  is  anticipated  that  these  strategies  can  also  be  applied 
during  the  integration  of  the  DAE  in  order  to  provide  a  more  robust  code  by  allowing 
larger  step  sizes  when  appropriate.  However  this  issue  is  still  to  be  examined.  Ad¬ 
ditionally  we  have  examined  polynomial  interpolation  applied  to  the  initialization  of 
the  Gauss-Newton  iteration  during  the  integration  of  the  completion. 

A  detailed  list  of  the  contributions  of  this  research  effort  are 

•  investigated  various  line  search  strategies  to  globalize  iterative  solver  for 
computing  consistent  initial  conditions  (Chapter  2); 

•  implemented  these  methods  in  FORTRAN  and  conducted  numerical  tests 
to  verify  utility  of  various  strategies; 

•  implemented  polynomial  interpolation  to  provide  different  predictors  for 
iterative  solver  (Section  3.2); 

•  for  a  fixed  step  size  and  using  Lagrange  polynomials,  proved  that  limit 
of  the  Gauss-Newton  solve  satisfied  an  auxiliary  DAE  (Sections  3.3  and 
3.4); 

•  proved  solutions  to  ADAE  satisfied  a  linear  auxiliary  ODE  in  the  linear 
time  varying  case  (Proposition  3.3.2); 

•  proved  nonunique  components  from  Gauss-Newton  satisfied  differential 
equation  whose  behavior  is  affected  by  predictor  used  (Propositions  3.3.3 
and  3.3.4); 
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•  proved  convergence  of  BDF-1  applied  to  the  auxiliary  DAE  in  the  m  —  1,2 
cases  (Theorem  3.3.1); 

•  conducted  numerical  tests  to  verify  analysis  of  predictors  applied  to  linear 
time  varying  DAEs  (Sections  3.3.3  and  3.3.4); 

•  examined  a  mixed  strategy  using  Nordsieck  vector  and  prediction  (Section 
3.3.4). 


Chapter  5 

Future  Research 


While  much  progress  has  been  made  towards  the  development  of  general  numerical 
methods  for  the  integration  of  fully  implicit  higher  index  DAEs,  there  remains  several 
areas  of  investigation.  This  section  will  briefly  describe  some  of  these  important 
research  issues. 

It  is  anticipated  that  the  final  versions  of  the  El  and  ICP  codes  will  be  variable 
step,  variable  order  versions.  Work  has  been  done  on  a  variable  step  version  of  the  El 
code.  It  has  been  observed  elsewhere  that  the  local  truncation  error  of  this  method 
is  not  identical  to  that  of  the  numerical  integration  scheme  on  which  it  is  based. 
Additional  research  is  required  to  estimate  the  local  truncation  error.  Strategies 
for  determining  when  to  change  step  size  and  order  to  preserve  the  stability  of  the 
method  need  to  be  developed.  Incorporating  the  globalization  strategies  that  we  used 
in  computing  consistent  initial  conditions  also  needs  to  be  investigated. 

The  El  and  ICP  methods  currently  employ  linear  multistep  methods  for  integrat¬ 
ing  the  completion.  These  methods  are  known  to  have  bounded  regions  of  absolute 
stability  and  are  generally  unsuitable  for  solving  stiff  ODEs.  There  needs  to  be  a 
mechanism  for  determining  when  the  completion  is  stiff.  If  stiff  methods  are  em¬ 
ployed  in  the  integration  of  the  completion,  then  the  Jacobian  or  an  approximation 
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of  the  right  hand  side  of  the  differential  equation  implicitly  defined  by  the  derivative 
array  equations  is  required.  To  date,  no  other  methods  for  integrating  the  completion 
have  been  implemented. 

The  linear  algebra  required  to  solve  the  least  squares  problem  in  the  Gauss- 
Newton  iteration  is  a  major  computational  cost  in  these  methods.  The  Jacobian  of 
the  derivatives  array  equations  have  a  block  triangular  structure  that  can  be  exploited 
by  faster  least  squares  solvers.  The  theoretical  work  that  would  need  to  be  completed 
would  include  showing  that  the  solver  works,  the  Gauss-Newton  iteration  converges, 
and  meaningful  completions  can  be  obtained. 

The  cost  of  constructing  the  derivative  array  equations  with  computer  algebra 
packages  is  high  and  limited  to  systems  of  moderate  dimension.  Although  automatic 
differentiation  has  been  studied  for  evaluating  the  derivative  array  equations  and  its 
Jacobian,  this  technology  has  not  been  implemented  into  any  of  the  general  methods 
for  integrating  DAEs.  This  might  just  involve  rewriting  the  codes  in  C  and  incor¬ 
porating  ADOL-C  for  obtaining  the  derivative  array  equations  and  the  associated 
Jacobian.  Parallel  implementation  of  these  methods  for  large  problems  may  also  be 
required  and  needs  to  be  investigated. 

Several  other  possibilities  can  be  examined  which  might  give  estimates  on  the 
higher  derivatives  contained  in  the  derivative  array  equations.  One  is  to  fit  a  polyno¬ 
mial  to  converged  values  of  the  state  variables.  By  differentiating  this  interpolator 
polynomial  we  can  obtain  approximations  to  the  derivatives  of  the  solution.  Another 
possibility  is  to  use  differencing  to  obtain  such  approximations.  Additional  differen¬ 
tiations  of  the  original  DAE  so  that  it  is  one  full  with  respect  to  some  of  the  higher 
order  derivatives  might  also  be  beneficial  in  this  area.  Again,  further  research  is 
needed  in  this  area. 
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